
thurler / reddit-tree-model


Final project for complex networks class at UFRJ. The proposal is to use Kaggle's Reddit dataset to bring forth a model describing Reddit comment trees.

License: GNU General Public License v3.0

Python 100.00%


Presentation with the idea and results

reddit-tree-model


R(t,p) model

A simple model, inspired by the Barabási-Albert (BA) model, for representing Reddit comment threads.

Parameters:

  • t: maximum number of iterations
  • p: the probability function that returns the probability that the current comment will receive a reply.

Initialization:

Starts with a graph containing only one vertex, the tree root (first comment in the thread). A cursor is pointing to this vertex, setting it as the 'current vertex'.

Step:

For each iteration (from 1 to t):

  • Decide with probability p whether the current vertex (the vertex the cursor points to) receives a reply.
    • If it does, a new vertex is added to the graph with an edge directed from the current vertex to the new one. The cursor then returns to the tree root (first comment in the thread) and the iteration ends, going back to the first step.
    • If it does not receive a reply, the cursor goes down one level of the tree.
      • If the current vertex has no neighbors, the cursor returns to the tree root (first comment in the thread) and the iteration ends, going back to the first step.
      • Otherwise, the next vertex is chosen from the next level with probability proportional to its in-degree, so that the more neighbors a vertex has, the more it is preferred, creating a preferential attachment (PA) dynamic. The current vertex is set to the selected vertex, and the process goes back to the first step (deciding whether the current vertex receives a reply); the iteration does not end here.
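
The step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in rp-simulator.py: the name simulate_thread and the children-list representation are hypothetical, p is treated as a constant probability rather than a function, and the selection weight of a reply is assumed to be its number of replies plus one. That last choice is an interpretation: with edges directed from parent to reply, every non-root vertex has in-degree 1, so the sketch weights by replies instead, and the added one keeps replies that have no replies of their own selectable.

```python
import random


def simulate_thread(t, p, seed=None):
    """Simulate one comment tree under the R(t,p) rules (illustrative sketch).

    Returns a list where children[v] holds the ids of the replies to vertex v.
    Vertex 0 is the tree root (first comment in the thread).
    """
    rng = random.Random(seed)
    children = [[]]  # start with the root only

    for _ in range(t):
        current = 0  # the cursor starts each iteration at the root
        while True:
            if rng.random() < p:
                # The current vertex receives a reply: add a new vertex
                # and an edge from the current vertex to it.
                children.append([])
                children[current].append(len(children) - 1)
                break  # iteration ends, cursor returns to the root
            replies = children[current]
            if not replies:
                break  # nothing below to move to, iteration ends
            # ASSUMPTION: weight = number of replies + 1.
            weights = [len(children[c]) + 1 for c in replies]
            current = rng.choices(replies, weights=weights, k=1)[0]

    return children


if __name__ == "__main__":
    tree = simulate_thread(t=1000, p=0.01, seed=42)
    print("vertices:", len(tree))
    print("direct replies to the root:", len(tree[0]))
```

Since each iteration adds at most one vertex, the resulting tree never has more than t + 1 vertices.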

Variations

There are a few variations on the basic step that can and have been implemented in this project. They will be documented here (and not only in the code itself) as soon as we have time for it.

Model simulator

Single thread - rp-simulator.py

Creates a single graph file at --out_dir/allgraphs. Also plots the graph at --out_dir/plot if --draw is set. The other parameters are listed with --help.

Example using mostly default values

python rp-simulator.py --p 0.01 --out_dir out

Batch process - run-simulation.py

Runs the process (rp-simulator.py) --runs times for each configuration. At the end, it collects all the files at --out_dir/allgraphs and, for each detected configuration, concatenates the graphs into a single file containing all threads, saved at --out_dir/finalgraphs.

Warning: the process concatenates all the files found at --out_dir/allgraphs, not only those created in the current execution.

Example running 5 times for ten distinct configurations, drawing each thread at --out_dir/plot/:

python run-simulation.py --out_dir out --runs 5 --draw_each True --p_min 0.01 --p_max 0.1 --p_step 0.01
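
To see where the ten configurations in this example come from, the following sketch enumerates the values of p between --p_min and --p_max in increments of --p_step. It assumes the upper bound is inclusive, which matches the count of ten stated above. The helper p_values is hypothetical, and run-simulation.py may build its grid differently.

```python
def p_values(p_min, p_max, p_step):
    """List the p values from p_min to p_max (inclusive) in steps of p_step."""
    # Count the steps as an integer first to avoid floating-point drift
    # from repeatedly adding p_step.
    n = int(round((p_max - p_min) / p_step)) + 1
    return [round(p_min + i * p_step, 10) for i in range(n)]


values = p_values(0.01, 0.1, 0.01)
runs = 5
print(len(values), "configurations:", values)
print("total simulated threads:", len(values) * runs)
```

Under this assumption the example command simulates 10 × 5 = 50 threads.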
