This project forked from brendano/tweetmotif

Topical search for Twitter. See twokenize.py, emoticons.py for tokenization.

Home Page: tweetmotif.com

Python 89.65% TeX 3.40% Shell 0.12% Makefile 0.01% ApacheConf 0.01% CSS 1.18% JavaScript 2.90% HTML 2.15% Smarty 0.58%

TweetMotif

TweetMotif is a faceted/topic/summarizing search system for Twitter, built on top of the search.twitter.com API. http://tweetmotif.com

Do you just want the tokenizer?

All you need is two files: twokenize.py and emoticons.py.

If you use it in research, please cite:

  • Brendan O'Connor, Michel Krieger, and David Ahn. TweetMotif: Exploratory Search and Topic Summarization for Twitter. ICWSM-2010.

Latest version (Java)

The latest version, with a number of improvements, is in Java. A new version was released in September 2012. See the explanation and links at: http://www.ark.cs.cmu.edu/TweetNLP

More on TweetMotif

By Brendan O'Connor, Michel Krieger, and David Ahn. Written over April-May 2009 and released April 2010.

The TweetMotif paper (inside EXAMPLES_AND_WRITING, or a copy at this link) gives an overview of the system.

Running TweetMotif

Prerequisites

  • Tokyo Cabinet
  • Tokyo Tyrant
  • mod_wsgi
  • Python: version 2.5 works

There are precompiled versions of the Tokyo infrastructure in platform/, for Mac OS X 10.5 and Ubuntu 8.04-ish. On the off chance that they work on your system, uncomment the code that specifies to use them (grep platform *.py). On Linux, you may also have to adjust ld.so.conf.d and run ldconfig so that mod_wsgi, which runs inside Apache, can see them.

You also need to be running Tokyo Tyrant for the query cache. That is often inconvenient when you are just getting started. In that case, disable the cache by commenting out the following lines in query_cache.py:

# the_cache = ....
# @the_cache.wrap
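For orientation, here is a minimal sketch of how a decorator-style cache with a `wrap` method might behave, which also suggests why commenting out the two lines should leave the wrapped function working, only uncached. The names `DictCache` and `search` are hypothetical stand-ins invented for this illustration; the real `the_cache` in query_cache.py is backed by Tokyo Tyrant and may be structured differently.

```python
import functools
import json


class DictCache(object):
    """Hypothetical in-memory stand-in for a Tokyo Tyrant-backed cache."""

    def __init__(self):
        self.store = {}
        self.misses = 0

    def wrap(self, func):
        """Return a version of func whose results are memoized by arguments."""
        @functools.wraps(func)
        def cached(*args, **kwargs):
            # Serialize the call into a string key, as a key-value
            # store would require.
            key = json.dumps([func.__name__, args, sorted(kwargs.items())])
            if key not in self.store:
                self.misses += 1
                self.store[key] = func(*args, **kwargs)
            return self.store[key]
        return cached


the_cache = DictCache()


@the_cache.wrap
def search(query):
    # Placeholder for an expensive call to the search API (assumption).
    return {"query": query, "results": []}
```

Removing the `the_cache = ...` assignment and the `@the_cache.wrap` decorator in a layout like this leaves `search` as a plain function, so every call goes straight to the underlying work.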

Architecture

There is a backend and a frontend. The backend talks to search.twitter.com and does all text processing, clustering, etc. The frontend is a Django web site with normal and iPhone versions.

The backend makes extensive use of Tokyo Cabinet and Tyrant databases, for both the language model and the query cache.

Both the backend and frontend are WSGI apps. Everything is set up to run through mod_wsgi. They communicate via JSON-over-HTTP.
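As a rough illustration of the JSON-over-HTTP pattern between WSGI apps, the sketch below defines a tiny WSGI callable that answers with a JSON body, plus a helper that invokes it in-process and decodes the reply. The names `backend_app` and `call_app` and the payload fields are assumptions made for this example; the actual backend in frontend.py has its own endpoints and response format.

```python
import json


def backend_app(environ, start_response):
    """Hypothetical WSGI backend: answers every request with a JSON body."""
    query = environ.get("QUERY_STRING", "")
    payload = {"query_string": query, "topics": []}
    body = json.dumps(payload).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, query_string):
    """Invoke a WSGI app in-process and decode its JSON response."""
    captured = {}

    def start_response(status, headers):
        # Record what the app reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"REQUEST_METHOD": "GET", "QUERY_STRING": query_string}
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body.decode("utf-8"))
```

In a deployment like the one described here, the frontend would reach the backend with an ordinary HTTP request through mod_wsgi rather than an in-process call; the helper only makes the request/response shape easy to see.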

Backend

The backend is run through, confusingly enough, frontend.py. That file also contains a primitive frontend for development purposes.

Frontend

The frontend is Django. See djfrontend/.

License

TweetMotif is licensed under the Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0.html

Copyright Brendan O'Connor, Michel Krieger, and David Ahn, 2009-2010.
