

TwitterStreamApp

The Twitter API provides a streaming endpoint that delivers a roughly 1% random sample of publicly available Tweets in real-time.

Prerequisites:

  • Docker with Linux containers.

Run Project: from the console, run:

  docker compose build
  docker compose up

Without Docker: install RabbitMQ locally, then in appsettings change "Server": "rabbitmq" to "Server": "localhost".

Structure:

  • TwitterStreamV2App: collects the Twitter stream and publishes the collected tweets to a RabbitMQ message queue.
  • RabbitMQ: the intermediate message broker that holds messages between the two apps.
  • TwitterMassagesConsumerApp: reads messages from RabbitMQ and stores them in memory. After every 50 messages read, it publishes statistics.
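
The producer, broker, and consumer flow described in this section can be sketched in a few lines. This is only an illustration, not the project's code: Python's in-process `queue.Queue` stands in for RabbitMQ, the function names and the `SENTINEL` end-of-stream marker are hypothetical, and the "statistics" are reduced to the running message count reported after every 50 messages.

```python
import queue
import threading

BATCH_SIZE = 50   # the consumer reports after every 50 messages read
SENTINEL = None   # hypothetical end-of-stream marker, not part of the real app


def producer(tweets, broker):
    """Stand-in for TwitterStreamV2App: publish each tweet to the broker."""
    for tweet in tweets:
        broker.put(tweet)
    broker.put(SENTINEL)


def consumer(broker, reports):
    """Stand-in for TwitterMassagesConsumerApp: read, store in memory, report."""
    stored = []
    while True:
        tweet = broker.get()
        if tweet is SENTINEL:
            break
        stored.append(tweet)
        if len(stored) % BATCH_SIZE == 0:
            # The real app would publish hashtag statistics here.
            reports.append(len(stored))


def run_pipeline(tweets):
    broker = queue.Queue()  # in-process queue standing in for RabbitMQ
    reports = []
    worker = threading.Thread(target=consumer, args=(broker, reports))
    worker.start()
    producer(tweets, broker)
    worker.join()
    return reports


if __name__ == "__main__":
    sample = [f"tweet {i} #demo" for i in range(120)]
    print(run_pipeline(sample))  # prints [50, 100]
```

Because the broker decouples the two sides, the producer never waits for the consumer to finish processing, which is the same reason the project places RabbitMQ between the two apps.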


Results:

  • Total Tweets count
  • Top 10 HashTags
  • Number of occurrences of each HashTag
  • Percentage of HashTag occurrences relative to all Tweets received
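
One way these results could be computed from tweets held in memory is sketched below. It rests on assumptions that are not taken from the project: tweets are plain text strings, hashtags are matched with a simplified regular expression, matching is case-insensitive, and a hashtag is counted at most once per tweet so that its share of all tweets never exceeds 100%. The function and field names are hypothetical.

```python
import re
from collections import Counter

# Simplified pattern; Twitter's real hashtag rules are stricter.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def hashtag_statistics(tweets, top_n=10):
    """Return the total tweet count and the top_n hashtags with their share."""
    total = len(tweets)
    counts = Counter()
    for text in tweets:
        # Assumption: count each hashtag once per tweet, ignoring case.
        counts.update({tag.lower() for tag in HASHTAG_PATTERN.findall(text)})
    top = [
        {
            "hashtag": tag,
            "count": n,
            "percent_of_tweets": round(100.0 * n / total, 2),
        }
        for tag, n in counts.most_common(top_n)
    ]
    return {"total_tweets": total, "top_hashtags": top}


if __name__ == "__main__":
    stats = hashtag_statistics(["Hello #World", "again #world #news", "no tags"])
    for row in stats["top_hashtags"]:
        print(row)
```

Counting per tweet rather than per occurrence is a design choice made for this sketch; if the project counts every occurrence instead, replace the set with the plain list returned by `findall`.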

How to Scale the app to consume 5700 tweets/second:

  • Add a load-balancing service that distributes the stream across N containers running TwitterStreamV2App.
  • Use Kubernetes to spawn N containers of TwitterStreamV2App that push messages into RabbitMQ or a similar broker; consider replacing it with Kafka.
  • Spawn N TwitterMassagesConsumerApp instances in Kubernetes to consume messages efficiently and store them in DynamoDB or a similar database that supports concurrent writes.
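
To choose N, a rough capacity estimate can help. The sketch below assumes a per-instance throughput figure, which is hypothetical here and would have to be measured for the real apps, plus a headroom factor for traffic spikes. It also shows one simple way to assign tweets to consumers with a stable hash; this is an illustrative choice, not something the project prescribes.

```python
import math
import zlib

TARGET_RATE = 5700  # tweets per second, from the scaling goal above


def required_instances(target_rate, per_instance_rate, headroom=0.3):
    """Instances needed to sustain target_rate with spare capacity.

    per_instance_rate is an assumed, measured throughput of one instance.
    """
    if per_instance_rate <= 0:
        raise ValueError("per_instance_rate must be positive")
    return math.ceil(target_rate * (1 + headroom) / per_instance_rate)


def partition_for(tweet_id, n_partitions):
    """Stable assignment of a tweet ID (a string) to one of n_partitions."""
    return zlib.crc32(tweet_id.encode("utf-8")) % n_partitions


if __name__ == "__main__":
    # Hypothetical figure: one consumer handles 1000 tweets per second.
    n = required_instances(TARGET_RATE, per_instance_rate=1000)
    print(n)  # prints 8
    print(partition_for("1234567890", n))
```

With the assumed 1000 tweets per second per instance and 30% headroom, 5700 × 1.3 / 1000 = 7.41, which rounds up to 8 instances.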

How to make results persistent:

  • Each TwitterMassagesConsumerApp instance needs to save its data to a NoSQL database such as DynamoDB from AWS.
  • Many NoSQL databases provide a locking mechanism that can be used to coordinate concurrent writes.
  • To do that, replace the implementation of StorageRepository with a NoSQL-backed one.
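
The idea of swapping the storage implementation and guarding concurrent writes can be illustrated with optimistic locking. Everything below is a sketch under stated assumptions: the method names on `StorageRepository` are hypothetical (the project's real interface may look different), and an in-memory dictionary with a version number per key stands in for a NoSQL table with conditional writes.

```python
import threading
from abc import ABC, abstractmethod


class StorageRepository(ABC):
    """Hypothetical shape of the repository interface; the real one may differ."""

    @abstractmethod
    def get(self, key):
        """Return (value, version), or (0, 0) if the key is absent."""

    @abstractmethod
    def put_if_version(self, key, value, expected_version):
        """Write value only if the stored version matches; return True on success."""


class InMemoryVersionedRepository(StorageRepository):
    """In-memory stand-in that mimics a conditional write in a NoSQL store."""

    def __init__(self):
        self._items = {}
        # Models the atomicity of a single conditional write in the store.
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._items.get(key, (0, 0))

    def put_if_version(self, key, value, expected_version):
        with self._lock:
            _, current = self._items.get(key, (0, 0))
            if current != expected_version:
                return False  # another writer got there first
            self._items[key] = (value, current + 1)
            return True


def increment(repo, key, amount=1):
    """Optimistic-locking update: re-read and retry until the write succeeds."""
    while True:
        value, version = repo.get(key)
        if repo.put_if_version(key, value + amount, version):
            return value + amount
```

Several consumers can then call `increment(repo, "#hashtag")` concurrently without losing updates, because a write based on a stale version is rejected and retried. A NoSQL-backed class would implement the same two methods against the database instead of a dictionary.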

TODO: Add unit and integration tests to cover the functionality.

Contributors: tsviet
