asapin / alkonost

Simple spam detector for YouTube chats with different UI options

License: MIT License

Rust 100.00%
rust youtube spam-detection tokio-rs reqwest strsim serde serde-json thiserror futures-rs

Alkonost

Simple console spam detector for YouTube chats.

Monitors a set of YouTube channels and starts collecting messages as soon as a new chat room opens. All collected messages are sent to an embedded spam detector and are also saved to a database for further analysis when looking for ways to improve spam detection.

The project consists of several modules:

  • shared - common types and objects used by other modules
  • StreamFinder - monitors YouTube channels for airing and upcoming streams and premieres
  • ChatPoller - loads messages from the YouTube chat
  • ChatManager - collects messages from every open chat room
  • Detector - analyses messages and tries to detect potential spammers
  • DB - saves all messages and the decisions made by Detector to a database (this module is not implemented yet)
  • Alkonost - main library, responsible for creating all other modules and re-exporting only the functionality that should be used by a UI implementation
  • UI - a collection of UI implementations for Alkonost

All modules except shared are implemented as independent actors, which should make it easy to scale horizontally in the future if such a need ever arises.
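
The actor approach can be illustrated with a minimal sketch. It uses only the standard library (a thread plus a `std::sync::mpsc` channel) rather than the project's actual async runtime, and every name in it (`DetectorMsg`, `spawn_detector`, the "same text repeated `limit` times" rule) is hypothetical and not taken from Alkonost's code.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

// Hypothetical message type; Alkonost's real messages are not shown in this README.
enum DetectorMsg {
    // A chat message: (author, text).
    Chat(String, String),
    // Ask the actor to stop and return what it has flagged.
    Stop,
}

// Spawns a toy "detector" actor. The actor owns its state and can only be
// reached through the returned channel sender.
// Toy rule (an assumption): an author who sends the same text `limit` times is flagged.
fn spawn_detector(limit: usize) -> (Sender<DetectorMsg>, JoinHandle<Vec<String>>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut counts: HashMap<(String, String), usize> = HashMap::new();
        let mut flagged: Vec<String> = Vec::new();
        // Process messages one at a time until Stop arrives or all senders are dropped.
        for msg in rx {
            match msg {
                DetectorMsg::Chat(author, text) => {
                    let n = counts.entry((author.clone(), text)).or_insert(0);
                    *n += 1;
                    if *n == limit && !flagged.contains(&author) {
                        flagged.push(author);
                    }
                }
                DetectorMsg::Stop => break,
            }
        }
        flagged
    });
    (tx, handle)
}

fn main() {
    let (tx, handle) = spawn_detector(3);
    for _ in 0..3 {
        tx.send(DetectorMsg::Chat("bot".into(), "buy now".into())).unwrap();
    }
    tx.send(DetectorMsg::Chat("alice".into(), "hello".into())).unwrap();
    tx.send(DetectorMsg::Stop).unwrap();
    println!("flagged: {:?}", handle.join().unwrap());
}
```

Because the actor is driven purely by messages, the sender side does not care whether the receiver lives in the same process, which is the property that makes later horizontal scaling plausible.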

YouTube API

This app doesn't use the YouTube API, and instead tries to emulate the behaviour of a browser. The reason for this decision is that by default YouTube provides only 10 thousand credits a day to spend on requests. One request to load new messages from a chat costs 5 credits and, depending on how active the chat is, should be performed every 5-10 seconds. If we assume that on average the app performs 5 requests per minute, we can estimate that it will spend around 5 × 5 × 60 = 1500 credits per hour.

Some channels either stream 24/7 (for example, Lofi Girl) or have streams scheduled far into the future that effectively act as chat rooms for viewers without the need to create a Discord server. At 1500 credits per hour, that's ~36000 credits per day for each such stream or chat room.

Moreover, to get the list of live broadcasts, we would have to use the Search API, which costs 100 credits per request, meaning that we could only make 1 request every ~15 minutes. That alone would use 9600 credits a day, leaving only 400 credits to actually collect chat messages. And that's only for 1 channel.
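
The quota estimates above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in this section; the constant and function names are made up for illustration and are not part of Alkonost.

```rust
// Figures taken from the text above.
const DAILY_QUOTA: u32 = 10_000; // credits available per day
const CHAT_REQUEST_COST: u32 = 5; // credits per chat request
const SEARCH_REQUEST_COST: u32 = 100; // credits per Search API request

// Credits spent per hour on polling one chat at the given request rate.
fn chat_credits_per_hour(requests_per_minute: u32) -> u32 {
    requests_per_minute * CHAT_REQUEST_COST * 60
}

// Credits left after running one search every `interval_minutes` for a full day.
// `interval_minutes` must be non-zero.
fn credits_left_after_search(interval_minutes: u32) -> u32 {
    let searches_per_day = 24 * 60 / interval_minutes;
    DAILY_QUOTA.saturating_sub(searches_per_day * SEARCH_REQUEST_COST)
}

fn main() {
    let per_hour = chat_credits_per_hour(5);
    println!("chat polling: {} credits/hour, {} credits/day", per_hour, per_hour * 24);
    println!("left after searching every 15 min: {}", credits_left_after_search(15));
}
```

With 5 requests per minute this gives 1500 credits per hour and 36000 per day for a single always-on chat, and a search every 15 minutes leaves 400 credits of the daily quota.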

Setup

Before using the app, you need to provide several settings: spam detection parameters (the DetectorParams struct), the user agent to use when making HTTP requests (the RequestSettings struct), and how often the app should check for new streams.

For now, all these settings are hardcoded inside each UI implementation, but eventually they should be loaded from a database and be modifiable at runtime through the UI.

You also need to provide a list of channels to track. The app uses the channel id when adding a new channel, but some YouTube channels use a custom user name instead of a channel id (e.g. https://www.youtube.com/user/PewDiePie). In that case, open any video from the channel and click on the channel's name under the video. This opens the same channel page, but this time the browser's address bar shows the channel id instead of the custom user name (e.g. https://www.youtube.com/channel/UC-lHJZR3Gqxm24_Vd_AJ5Yw for PewDiePie).
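
As a rough illustration of the two URL forms, the sketch below tells a channel-id URL apart from a custom-user-name URL. `ChannelRef` and `parse_channel_url` are hypothetical helpers, not part of Alkonost, and the sketch assumes only the two `https://` URL shapes shown above. It does not resolve a user name to a channel id; that still has to be done by hand as described.

```rust
// Hypothetical helper type (not part of Alkonost): what a channel URL refers to.
#[derive(Debug, PartialEq)]
enum ChannelRef<'a> {
    // URL of the form .../channel/<id>; the id can be used directly.
    Id(&'a str),
    // URL of the form .../user/<name>; the id has to be looked up manually.
    UserName(&'a str),
}

fn parse_channel_url(url: &str) -> Option<ChannelRef<'_>> {
    // Drop the scheme and host, keeping only the path part.
    let path = url
        .strip_prefix("https://www.youtube.com/")
        .or_else(|| url.strip_prefix("https://youtube.com/"))?;
    // Ignore any query string or fragment.
    let path = path.split(|c: char| c == '?' || c == '#').next()?;
    let mut parts = path.split('/').filter(|s| !s.is_empty());
    match (parts.next()?, parts.next()?) {
        ("channel", id) => Some(ChannelRef::Id(id)),
        ("user", name) => Some(ChannelRef::UserName(name)),
        _ => None,
    }
}

fn main() {
    for url in [
        "https://www.youtube.com/channel/UC-lHJZR3Gqxm24_Vd_AJ5Yw",
        "https://www.youtube.com/user/PewDiePie",
    ] {
        println!("{} -> {:?}", url, parse_channel_url(url));
    }
}
```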

To see exactly what you need to do when implementing a new UI, please check how the simple CLI UI is implemented.
