data_viz's Introduction

Exploring human knowledge through Wikipedia usage

“Exploring human knowledge through Wikipedia usage” is an unbiased data visualization app that can be used to observe Wikipedia user search trends. It was built as part of the EPFL course "Data Visualization".

Demo

You can see our visualization in action at the following URL: https://ividim.github.io/DataViz/

File Structure

  • assets: Contains the front-end part of the application.
  • data: Contains standalone data files that are used by the app.
  • server: Contains the back-end part of the application.
  • vendor: Contains external JavaScript libraries used in the front-end part of the app.

Authors

Hyun Jii Cho, Ivi Dimopoulou, Kirusanth Poopalasingam

Technical Details

How to import Wikipedia dumps

1. Download files

Download the "pagecounts-{}-{}-views-ge-5.bz2" files from: https://dumps.wikimedia.org/other/pagecounts-ez/merged/

2. Unzip

This command decompresses the file and removes the original compressed file:

bzip2 -d pagecounts-2016-10-views-ge-5.bz2

3. Keep only English articles

This command selects the English articles and writes them to a separate file:

grep '^en\.z ' pagecounts-2016-10-views-ge-5 > pagecounts-2016-10-views-ge-5-cleaned
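
For reference, the same filtering can be sketched in Python, reading the compressed dump directly so that steps 2 and 3 happen in one pass. This is only an illustrative alternative to the commands above, not part of the project: the function names `keep_project` and `filter_dump` are hypothetical, and the sketch assumes each line of the dump is space-separated with the project code (for example `en.z`) as its first field.

```python
import bz2
from typing import Iterable, Iterator


def keep_project(lines: Iterable[str], project: str = "en.z") -> Iterator[str]:
    """Yield only the lines whose first space-separated field equals `project`.

    Assumption: the project code is the first field and is followed by a space.
    """
    prefix = project + " "
    for line in lines:
        if line.startswith(prefix):
            yield line


def filter_dump(src_path: str, dst_path: str, project: str = "en.z") -> int:
    """Stream a .bz2 dump, write the matching lines to dst_path, return their count."""
    kept = 0
    # bz2.open in text mode decompresses on the fly, so the full dump
    # never has to be unpacked to disk first.
    with bz2.open(src_path, "rt", encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in keep_project(src, project):
            dst.write(line)
            kept += 1
    return kept
```

Unlike `bzip2 -d`, this leaves the original `.bz2` file in place. A call such as `filter_dump("pagecounts-2016-10-views-ge-5.bz2", "pagecounts-2016-10-views-ge-5-cleaned")` would produce the same kind of output file as the grep command, under the assumptions stated above.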

4. Get peak day

Run wiki/run.py, which keeps only the articles with a certain number of views and writes them to a separate file.

python run.py
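
The contents of run.py are not reproduced here, so the following is not that script. It is a minimal sketch of what a view-count filter of this kind could look like, under two labeled assumptions: each line is space-separated with the view total in the third field, and the threshold is supplied by the caller. The name `keep_popular` and the parameter `min_views` are hypothetical.

```python
from typing import Iterable, Iterator


def keep_popular(lines: Iterable[str], min_views: int) -> Iterator[str]:
    """Yield lines whose third space-separated field is a count >= min_views.

    Assumption: fields are "<project> <title> <views> ..."; lines that do
    not fit this shape are skipped rather than raising an error.
    """
    for line in lines:
        fields = line.split(" ")
        if len(fields) < 3:
            continue  # too few fields to contain a view count
        try:
            views = int(fields[2])
        except ValueError:
            continue  # third field is not an integer
        if views >= min_views:
            yield line
```

Because the function works on any iterable of lines, it can be applied to an open file object and processes the cleaned file line by line without loading it into memory.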
