

This project forked from matzika/socialsensor-stream-manager


Monitors a set of social streams (e.g. Twitter status updates) and collects the incoming content.

License: Apache License 2.0

Java 8.72% Perl 38.58% PHP 52.71%

socialsensor-stream-manager

Stream Manager monitors a set of seven social streams (Twitter, Facebook, Instagram, Google+, Flickr, Tumblr and YouTube) to collect incoming content relevant to a keyword, a user or a location, using the API that each service provides. The Twitter API works as a real-time service, whereas the other six act as polling consumers that perform requests to the network periodically. The framework also provides wrappers for a set of different storage backends.

Getting started

The input data to the Stream Manager determine the nature of the retrieved content. They are treated as input feeds, each of which represents a keyword, a user or a location. Input feeds can be read from several sources, such as a configuration file or a database. In general, the feeds used as input to the system are created through the `eu.socialsensor.sfc.streams.input.FeedCreator` interface, which is implemented according to the source the input data are read from. When a configuration file is used as input, feeds are created with the `eu.socialsensor.sfc.streams.input.FeedsCreatorImpl.ConfigFeedsCreator` class, whereas when the data are read from MongoDB, feeds are created with the `eu.socialsensor.sfc.streams.input.FeedsCreatorImpl.MongoFeedsCreator` class. Note that, in contrast to the other six APIs, Twitter can collect content of no specific origin when no input is given.
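
The idea of a source-independent feed creator can be illustrated with a minimal, self-contained sketch. Everything in it (`FeedSketch`, `Feed`, `FeedSource`, `ConfigFeedSource`) is a hypothetical stand-in invented for this example and is not the project's actual `FeedCreator` API; it assumes the configuration has already been parsed into a map from feed type to values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: these are not the project's real classes.
class FeedSketch {
    /** The three kinds of input the README describes. */
    enum FeedType { KEYWORD, USER, LOCATION }

    /** A single input feed: what kind of thing to track, and its value. */
    record Feed(FeedType type, String value) {}

    /** Source-agnostic contract: each input source supplies its own implementation. */
    interface FeedSource {
        List<Feed> createFeeds();
    }

    /** Builds feeds from already-parsed configuration entries (type -> values). */
    static final class ConfigFeedSource implements FeedSource {
        private final Map<FeedType, List<String>> entries;

        ConfigFeedSource(Map<FeedType, List<String>> entries) {
            this.entries = entries;
        }

        @Override
        public List<Feed> createFeeds() {
            List<Feed> feeds = new ArrayList<>();
            for (Map.Entry<FeedType, List<String>> entry : entries.entrySet()) {
                for (String value : entry.getValue()) {
                    feeds.add(new Feed(entry.getKey(), value));
                }
            }
            return feeds;
        }
    }

    public static void main(String[] args) {
        FeedSource source = new ConfigFeedSource(Map.of(
                FeedType.KEYWORD, List.of("earthquake", "flood"),
                FeedType.USER, List.of("some_account")));
        source.createFeeds().forEach(System.out::println);
    }
}
```

A database-backed source would implement the same interface and query its store inside `createFeeds()`, so the code that consumes the feeds does not need to know where they came from.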

To start collecting content from all or a subset of the above social networks, run the `eu.socialsensor.sfc.streams.StreamCollector` class. This class first reads a configuration file that contains the credentials needed to establish a connection with each social network, as well as the mandatory fields for storing data to and reading data from the selected databases. After the configuration file is read, retrieval proceeds as follows:

  1. An instance of the `eu.socialsensor.sfc.streams.management.StreamsManager` class is created. This class is responsible for managing all the streams (opening, searching and closing a stream):

        StreamsManager manager = new StreamsManager(config);

  2. The manager opens all the streams, reads the input feeds with `FeedCreator` and establishes the connections with the given credentials:

        manager.open();

  3. The manager starts the retrieval process for all the streams. For the non-real-time APIs, polling requests are performed periodically:

        manager.search();
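
The open/search/close lifecycle of a polling consumer can be sketched as follows. This is an assumption-laden illustration rather than the project's code: `SocialStream` and `PollingStream` are hypothetical names, the "request" is just a counter increment, and the polling period is arbitrary.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: these are not the project's real classes.
class LifecycleSketch {
    interface SocialStream {
        void open();    // acquire resources / connect
        void search();  // start retrieving content
        void close();   // stop retrieving and release resources
    }

    /** A polling consumer: repeats a (simulated) request at a fixed period. */
    static final class PollingStream implements SocialStream {
        private final String name;
        private final long periodMillis;
        final AtomicInteger requests = new AtomicInteger();
        private ScheduledExecutorService scheduler;

        PollingStream(String name, long periodMillis) {
            this.name = name;
            this.periodMillis = periodMillis;
        }

        @Override
        public void open() {
            scheduler = Executors.newSingleThreadScheduledExecutor();
        }

        @Override
        public void search() {
            // A real wrapper would call the network's API here instead of counting.
            scheduler.scheduleAtFixedRate(
                    requests::incrementAndGet, 0, periodMillis, TimeUnit.MILLISECONDS);
        }

        @Override
        public void close() {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<PollingStream> streams = List.of(
                new PollingStream("flickr", 50), new PollingStream("tumblr", 50));
        streams.forEach(SocialStream::open);
        streams.forEach(SocialStream::search);
        Thread.sleep(200); // let a few polling rounds run
        streams.forEach(SocialStream::close);
        streams.forEach(s ->
                System.out.println(s.name + " issued " + s.requests.get() + " requests"));
    }
}
```

A real-time stream would implement the same three methods but register a listener in `search()` instead of scheduling periodic requests.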
    

Inside Stream Manager, each stream runs in its own thread, so each social network wrapper can be given different input feeds to track. Additionally, each feed is tracked by a separate thread to reduce retrieval time. Twitter is the exception, because it acts as a subscriber to a real-time stream instead of polling.
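
One way to track each feed on its own thread is a thread pool with one task per feed. The sketch below is only an illustration of that pattern under stated assumptions: `PerFeedThreadsSketch`, `fetch` and `trackAll` are hypothetical names, and `fetch` returns a placeholder string where a real wrapper would issue an API request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: these are not the project's real classes.
class PerFeedThreadsSketch {
    /** Placeholder for one API request; a real wrapper would call the network here. */
    static String fetch(String network, String feed) {
        return network + ":" + feed;
    }

    /** Runs one task per feed on a pool sized to the feed count and gathers the results. */
    static List<String> trackAll(String network, List<String> feeds) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, feeds.size()));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String feed : feeds) {
                tasks.add(() -> fetch(network, feed));
            }
            List<String> results = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while tracking feeds", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a feed task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(trackAll("flickr", List.of("sunset", "athens")));
    }
}
```

Because the requests for different feeds run concurrently, the total time for one polling round is bounded by the slowest request rather than by the sum of all of them.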

The data collected by the retrieval process are stored as JSON items (representing status updates, posts, etc.), media items (images, videos, albums), users and webpages (embedded in posts or statuses). These can be stored in MongoDB, Lucene or Solr, which are the currently supported stores. In addition, the collected data are used to create graphs, which currently model the user -- retweets -- user and user -- mentions -- user relationships. The graphs are created with the Titan and Neo4j graph databases. The storage process is handled by the `eu.socialsensor.sfc.streams.management.StoreManager` class.
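
Forwarding every collected item to several storage backends can be sketched as a simple fan-out. This is a hedged illustration, not the behaviour of the project's `StoreManager`: `Item`, `Storage`, `InMemoryStorage` and `FanOutStore` are hypothetical names, and the choice to continue when one backend fails is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: these are not the project's real classes.
class StorageFanOutSketch {
    /** Minimal stand-in for a collected item. */
    record Item(String id, String source, String text) {}

    /** One storage backend (for example a document store or a search index). */
    interface Storage {
        void store(Item item) throws Exception;
    }

    /** Keeps items in a list; stands in for a real database wrapper. */
    static final class InMemoryStorage implements Storage {
        final List<Item> items = new ArrayList<>();

        @Override
        public void store(Item item) {
            items.add(item);
        }
    }

    /** Sends each item to every backend; one failing backend does not block the others. */
    static final class FanOutStore {
        private final List<Storage> backends;

        FanOutStore(List<Storage> backends) {
            this.backends = List.copyOf(backends);
        }

        /** Returns the number of backends that accepted the item. */
        int store(Item item) {
            int accepted = 0;
            for (Storage backend : backends) {
                try {
                    backend.store(item);
                    accepted++;
                } catch (Exception e) {
                    System.err.println("storage failed: " + e.getMessage());
                }
            }
            return accepted;
        }
    }

    public static void main(String[] args) {
        InMemoryStorage primary = new InMemoryStorage();
        InMemoryStorage index = new InMemoryStorage();
        Storage broken = item -> { throw new Exception("backend unavailable"); };
        FanOutStore store = new FanOutStore(List.of(primary, index, broken));
        int accepted = store.store(new Item("1", "twitter", "hello"));
        System.out.println(accepted + " of 3 backends stored the item");
    }
}
```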

Learning more

The Stream Manager project depends on three other SocialSensor projects:

  1. Socialmedia-abstractions: the abstraction layer that maps the wrappers of different social networks to a single representation.
  2. Socialsensor-framework-common: contains the main classes and interfaces used by the other SocialSensor projects.
  3. Socialsensor-framework-client: the wrappers for storing and retrieving information in the supported databases (MongoDB, Solr, Lucene).

Contact for further details about the project

Aikaterini Iliakopoulou ([email protected]), Manos Schinas ([email protected]), Symeon Papadopoulos ([email protected])

