
re-data's Introduction

re-data

This project is an ongoing effort to provide a simple JSON API for conference data. It is under heavy development.

Example data includes re:publica, 30C3, 31C3. Take a look at the scrapers directory.

Documentation

Documentation on the API can be found here

Contributing

The infrastructure is still very rough as this is a side project of several people. If you are interested in helping out, send pull requests or join the mailing list.

Examples

Set it up locally

  • you need a CouchDB instance (for example, use a Docker container to set one up easily)
    • docker run -d -p 5984:5984 fedora/couchdb
    • curl -X PUT http://localhost:5984/_config/admins/user -d '"secret"' (creates user user with password secret)
  • copy config.js.dist to config.js and fill in the credentials for the CouchDB instance (see curl step above)
  • copy scraper/config/scrapers.js.example to scraper/config/scrapers.js (default config is fine for a first run)
  • fetch dependencies via npm (run this in the scraper subdirectory):
    • npm install
  • run the resetDB command inside the scraper subdirectory:
    • NODE_PATH=node_modules node scraper.js resetDB (NODE_PATH points Node at the local, not globally installed, module directory; node_modules was created by the npm install step)
  • run the import command inside the scraper subdirectory:
    • NODE_PATH=node_modules node scraper.js import
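
As a rough illustration of the "fill in the credentials" step, here is a minimal sketch of what config.js might look like afterwards. The layout mirrors the config.js quoted in the "Problem setting it up" issue further down; the credentials are the example values from the curl step (user / secret), and the database name is simply the one from that issue, so treat all values as assumptions to adapt.

```javascript
// Sketch of a filled-in config.js (values are examples, not project defaults).
var config = {
  // Where the API listens.
  app: { host: 'localhost', port: 9999 },

  // CouchDB connection; username/password match the curl step above.
  db: {
    database: 'rp-data',
    host: 'localhost',
    port: 5984,
    options: {
      secure: false,
      auth: { username: 'user', password: 'secret' }
    }
  },

  version: 0.1
};

module.exports = config;
```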

re-data's People

Contributors

toto, michaelkreil, yetzt, ffalt, morrisjobke, davidc, astro


re-data's Issues

Problem setting it up

$ NODE_PATH=./node_modules node scraper.js import

module.js:340
    throw err;
          ^
Error: Cannot find module './31C3/scraper.js'
    at Function.Module._resolveFilename (module.js:338:15)
    at Function.Module._load (module.js:280:25)
    at Module.require (module.js:364:17)
    at require (module.js:380:17)
    at Object.<anonymous> (/home/mjob/Projekte/re-data/scraper/config/scrapers.js:5:19)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.require (module.js:364:17)

My ./config.js:

#!/usr/bin/env node

/* configuration */
module.exports = {

    /* listen on tcp or socket */
    app: {
        host: 'localhost',
        port: 9999
    },

    /* couchdb configuration */
    db: {
        database: 'rp-data',
        host: 'localhost',
        port: 5984,
        options: {
            secure: false,
            auth: {
                username: '',
                password: ''
            }
        }
    },

    /* version */
    version: 0.1

};

My scraper/config/scrapers.js:

exports.scrapers = {
    // { module:require('./rp13/scraper.js'), db:true },
    // { module:require('./rp14/scraper.js'), db:true }
    // { module:require('./altconf14/scraper.js'), db:false }
    '30C3': { module:require('./31C3/scraper.js'), db:true }
}

cc @toto - feel free to ping me on Twitter
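
The stack trace above shows the import failing while scrapers.js resolves a require path. One way to make this kind of failure easier to diagnose is to wrap the require call. The following is only a sketch: loadScraper is a hypothetical helper, not part of re-data, and the loader is passed in explicitly so that relative paths resolve against the calling file.

```javascript
// Hypothetical helper: load a scraper module and report a readable message
// when the module path cannot be resolved, instead of a raw stack trace.
// `load` is the caller's own `require`, so relative paths behave as usual.
function loadScraper(name, modulePath, load) {
  try {
    return { name: name, module: load(modulePath), db: true };
  } catch (err) {
    // Node sets err.code to 'MODULE_NOT_FOUND' for unresolved modules.
    // Note: this also triggers if the scraper exists but one of its own
    // dependencies is missing.
    if (err.code === 'MODULE_NOT_FOUND') {
      console.error('Scraper "' + name + '" could not be loaded from ' + modulePath);
      return null;
    }
    throw err; // anything else is a real bug in the scraper itself
  }
}
```

A call such as loadScraper('31C3', './31C3/scraper.js', require) would then return null and print a one-line message if that directory does not exist.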

Move specification to a separate repository/organisation

At the moment the specification looks like an api documentation for the tools hosted in the same repository.

To get the specification used by other developers/users, it should be moved to a separate repository/organisation. It might also be a good idea to host the latest specification on a GitHub page.

The "TLS 1.3 Draft" is an example of a specification hosted on GitHub.

Don't fuck up the database when the source is down!

We need a solution here that keeps a history of the data, so that we can revert to an older state.

Scenario: the re-publica.de server is down and our scraper wrecks our database!
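
A history with revert is the full solution asked for here. As a smaller first step, the import could refuse to run when a scrape looks broken. The sketch below assumes a hypothetical isSafeToImport check and an arbitrary threshold of 50%; neither comes from the project.

```javascript
// Hypothetical guard: decide whether a fresh scrape is safe to import.
// previousCount: number of items currently stored for this source.
// scrapedItems:  array of items the scraper just produced.
// minRatio:      assumed threshold; below it the scrape is treated as broken.
function isSafeToImport(previousCount, scrapedItems, minRatio) {
  if (minRatio === undefined) minRatio = 0.5;

  // An empty or missing result usually means the source was unreachable.
  if (!Array.isArray(scrapedItems) || scrapedItems.length === 0) return false;

  // Nothing stored yet, so there is nothing to destroy.
  if (previousCount === 0) return true;

  // Reject scrapes that lost a large share of the existing items.
  return scrapedItems.length / previousCount >= minRatio;
}
```

With this check in front of the import, a downed source would leave the existing data untouched rather than replacing it with an empty set.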

couchdb import

The scraped data arrives in scraper/lib/db.js and has to be updated in CouchDB. The old code is in importer/importer.js. Ideally, the old DB entries are not thrown away but only updated.
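
To update rather than replace, CouchDB needs the current _rev of each existing document. A minimal sketch of that merge step follows; buildBulkUpdate is a hypothetical name, and it assumes the scraped documents already carry stable _id values. How the existing documents are fetched and how the result is sent is left out.

```javascript
// Hypothetical merge step: attach the stored _rev to every scraped document
// that already exists, so CouchDB treats it as an update instead of a conflict.
// Documents that are stored but missing from the scrape are left untouched.
function buildBulkUpdate(existingDocs, scrapedDocs) {
  var revById = new Map();
  existingDocs.forEach(function (doc) {
    revById.set(doc._id, doc._rev);
  });

  return scrapedDocs.map(function (doc) {
    var out = Object.assign({}, doc); // do not mutate the scraper's output
    if (revById.has(doc._id)) {
      out._rev = revById.get(doc._id);
    }
    return out;
  });
}
```

The returned array can be sent as the docs field of a POST to the database's _bulk_docs endpoint.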

Find a name for the specification

The specification should get a name. It is much easier to talk/write about something if it has a name.

Something like "OEDF" - "Open Event Data Format"

Pagination for the API

Pagination would probably make sense for the /speakers and /sessions overviews. Default 20, max 100, or whatever you think is right. Sure, the output should be cached locally anyway, but returning 300 sessions and just as many speakers with all details generates a lot of traffic, quite apart from the fact that clients would first have to process all of it locally.
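
A sketch of how the suggested limits could be applied, using the proposed default of 20 and maximum of 100. The parameter names limit and offset and the shape of the response are assumptions, not part of the existing API.

```javascript
// Hypothetical pagination helper for list endpoints such as /sessions.
// query: parsed query-string object, e.g. { limit: '50', offset: '100' }.
function paginate(items, query) {
  var DEFAULT_LIMIT = 20;
  var MAX_LIMIT = 100;

  var limit = parseInt(query.limit, 10);
  if (isNaN(limit) || limit < 1) limit = DEFAULT_LIMIT;
  if (limit > MAX_LIMIT) limit = MAX_LIMIT;

  var offset = parseInt(query.offset, 10);
  if (isNaN(offset) || offset < 0) offset = 0;

  return {
    total: items.length, // lets clients know how many pages to expect
    offset: offset,
    limit: limit,
    data: items.slice(offset, offset + limit)
  };
}
```

Invalid or missing values fall back to the defaults instead of producing an error, which keeps existing clients working if they send no parameters at all.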

Planning mode

One thing to consider is whether we want to allow sessions without a time/date, in a sort of planning mode.

This would enable a few use cases we have not covered yet and that are typical for all conferences I know of:

  • You could do something like halfnarp with re-data
  • For an app I would imagine that it would go into a "planning mode" where you can pre-pick your fav sessions before the final timeslots are selected.

Technically I would make two adjustments:

  • Make day, begin and end optional for sessions
  • Add some kind of state to the event so that an API consumer can easily tell what state the conference is in and whether it supports that state (not all apps make sense in planning mode; some only make sense there).

What do you say?
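
To make the two adjustments concrete, here is a sketch of how a consumer might validate sessions depending on the event state. The state names planning and scheduled and the fields id and title are assumptions for illustration; only day, begin and end come from the proposal above.

```javascript
// Hypothetical event states; the proposal only asks for "some kind of state".
var EVENT_STATES = ['planning', 'scheduled'];

// A session always needs an id and a title (assumed fields). Its day, begin
// and end are required only once the event has left planning mode.
function isSessionValid(session, eventState) {
  if (EVENT_STATES.indexOf(eventState) === -1) return false;
  if (!session || typeof session.id !== 'string' || typeof session.title !== 'string') {
    return false;
  }
  if (eventState === 'planning') return true;
  return Boolean(session.day && session.begin && session.end);
}

// Lets an API consumer check whether it can handle the event's current state.
function clientSupports(supportedStates, eventState) {
  return supportedStates.indexOf(eventState) !== -1;
}
```

A schedule-only app would call clientSupports(['scheduled'], event.state) and show a placeholder while the event is still in planning mode.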

base_url for 31c3?

I was just told about this project and now I am very curious: where's the data for this conference?

check REST API

Import, database and documentation are complete and running. Now we only need to check the REST API.
