aufschreib's Introduction

aufschreib

##Purpose

This is my playground for learning to use Node.js as a server and to build SVG charts with d3. It has not been tested on a production server, so be warned.

What it should do:

a) Provide a Node.js HTTP server that lets humans classify tweets in a browser.

b) Train a Bayesian classifier with these classifications.

c) Produce some statistical representations based on it.

And why?

In January/February 2013, (mostly German-speaking) women on Twitter started to post personal stories of experienced sexism and harassment under the hashtag #aufschrei (#outcry). Soon many more people were using the hashtag to post opinions, links, troll comments, and spam. I want to analyze this large number of tweets and maybe contribute the results (if they are of any use) to the aufschreiStat project.

##Current state in words

The result data is NOT RELIABLE yet!

TODO-List

##Current state in pictures

See for yourself

##Requirements

http://nodejs.org/

http://bower.io/

http://www.mongodb.org/

Run npm install in the root folder to install the required Node.js packages.

Run bower install in /static/ to install the required client-side JS packages.

##Usage

0. Prepare Config File

Copy "config.dist.js" and rename the copy to "config.js".

Then change the connection details to match your setup:

const mongo_settings = {
	"hostname": "localhost",
	"port": 27017,
	"username": "aufschreib",
	"password": "ohsosecret",
	"name": "aufschreib",
	"db": "aufschreib"
};
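
How such settings could be turned into a connection string can be sketched as follows. This is only a minimal sketch, not the project's actual code: `buildMongoUri` is a hypothetical helper, and it assumes that `db` is the database to connect to. The role of `name` is not clear from this README, so it is ignored here.

```javascript
// Example values copied from the config snippet above.
const mongo_settings = {
  hostname: "localhost",
  port: 27017,
  username: "aufschreib",
  password: "ohsosecret",
  name: "aufschreib",
  db: "aufschreib"
};

// Hypothetical helper (not part of aufschreib): builds a standard
// mongodb:// connection string from a settings object of this shape.
function buildMongoUri(settings) {
  // Percent-encode the credentials so characters like "@" or ":" stay valid.
  const user = encodeURIComponent(settings.username);
  const pass = encodeURIComponent(settings.password);
  return `mongodb://${user}:${pass}@${settings.hostname}:${settings.port}/${settings.db}`;
}

console.log(buildMongoUri(mongo_settings));
// prints "mongodb://aufschreib:ohsosecret@localhost:27017/aufschreib"
```

Printing the string before running the later steps is a cheap way to spot a typo in the host, port or database name.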

2. Prepare

Put your base JSON file named "tweets.json" into the /data/ folder

The tweet format must be the same one Twitter uses:

[
{
	"created_at": "Thu, 31 Jan 2013 18:22:47 +0000",
	"id_str": "297047589672343473",
	"source": "<a href=\"http://client.url/\">Client</a>",
	"text": "Some Tweet text with #hashtags, @usernames and http/https-links",
	"user": {
		"profile_image_url": "http://a0.twimg.com/profile_images/nr/some.png",
		"screen_name": "TwitterUser"
	}
},
...
]

Alternatively, implement support for another format in the file "prepare.js".
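
Before running the later steps, it may help to check that every entry has the fields shown in the sample. The following is a rough sketch under assumptions: `findInvalidTweets` is a hypothetical helper, and treating exactly the sample's fields as required is a guess, since this README does not say which fields "prepare.js" actually reads.

```javascript
// Fields that appear in the sample entry above; treating them as
// required is an assumption, not something the project documents.
const REQUIRED_FIELDS = ["created_at", "id_str", "source", "text"];
const REQUIRED_USER_FIELDS = ["profile_image_url", "screen_name"];

// Hypothetical helper: returns one { index, missing } record per entry
// that lacks a string value for any of the fields listed above.
function findInvalidTweets(tweets) {
  if (!Array.isArray(tweets)) {
    throw new TypeError("expected an array of tweets");
  }
  const problems = [];
  tweets.forEach((tweet, index) => {
    const entry = tweet || {};
    const user = entry.user || {};
    const missing = REQUIRED_FIELDS.filter((key) => typeof entry[key] !== "string");
    REQUIRED_USER_FIELDS.forEach((key) => {
      if (typeof user[key] !== "string") missing.push("user." + key);
    });
    if (missing.length > 0) problems.push({ index, missing });
  });
  return problems;
}

// In practice the array would come from the file, for example:
//   const tweets = JSON.parse(require("fs").readFileSync("data/tweets.json", "utf8"));
const sample = [
  {
    created_at: "Thu, 31 Jan 2013 18:22:47 +0000",
    id_str: "297047589672343473",
    source: "<a href=\"http://client.url/\">Client</a>",
    text: "Some Tweet text with #hashtags, @usernames and http/https-links",
    user: {
      profile_image_url: "http://a0.twimg.com/profile_images/nr/some.png",
      screen_name: "TwitterUser"
    }
  },
  { created_at: "Thu, 31 Jan 2013 18:22:47 +0000", id_str: "1", source: "web", user: {} }
];

console.log(JSON.stringify(findInvalidTweets(sample)));
```

An empty result means every entry matches the sample's shape. Any other result lists the index of each offending entry together with the missing field names.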

3. Edit Categories (optional)

Edit "consts.js"

const cats = [
{
	id: 'outcry',
	name: 'Aufschrei',
	icon: 'icon-bullhorn',
	color: '#5e8c6A'
},
...
];    

If you edit the categories, you need to set the threshold parameters for the Bayesian filter, too.

'Specify the classification thresholds for each category. To classify an item in a category with a threshold of x, the probability that the item is in the category has to be more than x times the probability that it's in any other category. Default value is 1.' Source

const thresholds = {
	spam: 3,
	troll: 2,
	report: 2,
	comment: 1,
	outcry: 1
};
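
The threshold rule quoted above can be illustrated with a small sketch. It is not the classifier library's code: `pickCategory` and `missingThresholds` are hypothetical helpers, the `"unclassified"` fallback label is an assumption, and the rule is applied to plain probabilities, whereas a real implementation may work differently internally.

```javascript
const thresholds = { spam: 3, troll: 2, report: 2, comment: 1, outcry: 1 };

// Hypothetical check: lists category ids that have no threshold entry.
function missingThresholds(cats, thresholds) {
  return cats
    .map((cat) => cat.id)
    .filter((id) => !Object.prototype.hasOwnProperty.call(thresholds, id));
}

// Hypothetical illustration of the quoted rule: the most probable
// category wins only if its probability is MORE than `threshold` times
// the probability of every other category. Otherwise return `fallback`.
function pickCategory(probabilities, thresholds, fallback = "unclassified") {
  const entries = Object.entries(probabilities);
  if (entries.length === 0) return fallback;
  const [best, bestP] = entries.reduce((a, b) => (b[1] > a[1] ? b : a));
  // The quoted documentation gives 1 as the default threshold.
  const threshold = Object.prototype.hasOwnProperty.call(thresholds, best)
    ? thresholds[best]
    : 1;
  for (const [cat, p] of entries) {
    if (cat !== best && bestP <= p * threshold) return fallback;
  }
  return best;
}

console.log(pickCategory({ spam: 0.6, outcry: 0.3 }, thresholds)); // prints "unclassified"
console.log(pickCategory({ spam: 0.8, outcry: 0.2 }, thresholds)); // prints "spam"
console.log(pickCategory({ outcry: 0.6, spam: 0.4 }, thresholds)); // prints "outcry"
```

With a threshold of 3, "spam" needs more than three times the probability of any other category, so 0.6 against 0.3 is not enough. "outcry" has a threshold of 1 and only needs to be the most probable category.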

3. Longify Urls (optional)

Run in /bin/:

node "longifyurls.js"

This expands Twitter's short URLs (t.co) through http://www.longurlplease.com/. The expanded URLs are then checked for other URL shortening services, too.

A file "urls.json" with the expanded URLs will be created and used.
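
The idea of finding links in a tweet and deciding whether they still point at a shortener can be sketched without any network access. This is not the code in "longifyurls.js": `extractUrls` and `isShortUrl` are hypothetical helpers, and the host list is purely illustrative.

```javascript
// Illustrative list of shortener hosts; the services the project
// actually checks are not named in this README.
const SHORTENER_HOSTS = new Set(["t.co", "bit.ly", "goo.gl", "tinyurl.com"]);

// Hypothetical helper: returns all http/https links found in a tweet text.
function extractUrls(text) {
  return text.match(/https?:\/\/\S+/g) || [];
}

// Hypothetical helper: true if the URL's host is on the shortener list,
// which means it would need (another round of) expansion.
function isShortUrl(url) {
  try {
    return SHORTENER_HOSTS.has(new URL(url).hostname.toLowerCase());
  } catch (err) {
    return false; // not a parseable URL
  }
}

const text = "see http://t.co/abc123 and https://example.org/page";
console.log(extractUrls(text).filter(isShortUrl)); // prints [ 'http://t.co/abc123' ]
```

Running the same check again on each expanded URL covers the case where a t.co link resolves to yet another shortener.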

4. Prepare Script (mandatory!)

Run in /bin/:

node "prepare.js"

Collections are created and filled with data. Aaaaaand... wait until the process finishes.

5. Server

We're nearly there

Edit "config.js" if you want to change where to access the server

const server_settings = {
	listento: '0.0.0.0',
	port: 8081
};

now run

node "app.js"

and open the address in your browser

e.g. http://localhost:8081/
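
How the address relates to the settings can be sketched like this. `browserUrl` is a hypothetical helper, not part of the project. It relies on the fact that `0.0.0.0` means "listen on all interfaces" and is not itself an address to browse to, so `localhost` is used for a server on the same machine.

```javascript
// Example values copied from the config snippet above.
const server_settings = {
  listento: "0.0.0.0",
  port: 8081
};

// Hypothetical helper: derives an address to type into a local browser.
function browserUrl(settings) {
  const host = settings.listento === "0.0.0.0" ? "localhost" : settings.listento;
  return `http://${host}:${settings.port}/`;
}

console.log(browserUrl(server_settings)); // prints "http://localhost:8081/"
```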

default username is: admin

password is: totalsupergehaim

Happy classifying!

aufschreib's People

Contributors

ffalt, yetzt

aufschreib's Issues

TypeError in "1 prepare.js" with usedb = true

I followed your setup guide and executed node "1 prepare.js" on step 4.

On the first run, data/messages.json contained only your example (without the semicolon and ellipsis, of course). The second time, data/messages.json was filled with random tweets via https://stream.twitter.com/1.1/statuses/filter.json. Both times the same error occurred (see console output).

The database contains a table tweets with all tweets from data/messages.json, but the site says:

Nix :] ("Nothing")

When running with usedb = false, the site shows all tweets but doesn't work (that would be another issue/fix).

Could you provide a sample data/messages.json that works for you with mysql?

Console Output:

    […]
[Tweets] Connecting to DB
[DB] Preparing Tweets
[DB] Creating Tables
[DB] Tweet Table Created
[DB] VoteUsers Table Created
[DB] Votes Table Created
[DB] Pumping Tweets to DB
[DB] Tweets stored

TypeError: undefined is not a function
    at /Users/ckintner/aufschreib/1 prepare.js:55:4
    at Query._callback (/Users/ckintner/aufschreib/tweets_mysql.js:139:4)
    at Query.Sequence.end (/Users/ckintner/aufschreib/node_modules/mysql/lib/protocol/sequences/Sequence.js:66:24)
    at Query._handleFinalResultPacket (/Users/ckintner/aufschreib/node_modules/mysql/lib/protocol/sequences/Query.js:139:8)
    at Query.OkPacket (/Users/ckintner/aufschreib/node_modules/mysql/lib/protocol/sequences/Query.js:73:10)
    at Protocol._parsePacket (/Users/ckintner/aufschreib/node_modules/mysql/lib/protocol/Protocol.js:169:24)
    at Parser.write (/Users/ckintner/aufschreib/node_modules/mysql/lib/protocol/Parser.js:62:12)
    at Protocol.write (/Users/ckintner/aufschreib/node_modules/mysql/lib/protocol/Protocol.js:36:16)
    at write (_stream_readable.js:547:24)
    at flow (_stream_readable.js:556:7)

Additional information

I'm running

  • Mac OS X 10.8.2
  • node 0.10.0
  • MySQL Ver 5.6.10 for osx10.7 on x86_64 (MySQL Community Server (GPL))
  • mysql (node driver) 2.0.0-alpha7
