wiese / cyclopol

cyclopol – browse and visualize press releases

Screenshot of search results in a list as well as visualized on a map and timeline

Installation

Copy .env.example to .env and set appropriate values.

Commands

  • starting the services
    docker-compose up
    • db - persistence
    • app - data import pipeline and API
    • ssg - graphical front end
  • creating the database schema (keep in mind that the initial setup of the db container can take a long time, so the database users may not exist yet right after startup)
    docker-compose exec app bin/console doctrine:schema:create
  • running the data import pipeline
    docker-compose exec app bin/console app:workflow
  • updating the database schema (only needed after changing DataModels)
    docker-compose exec app bin/console doctrine:schema:update
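
The first-time setup above can be scripted. The following is a minimal sketch, not part of the project: it assumes `docker-compose up -d` (detached mode, so the script can continue) instead of the plain `docker-compose up` listed above, and it retries the schema creation because the db container may still be initializing. The helper names `run_with_retry` and `first_time_setup`, the retry count, and the delay are hypothetical.

```python
import subprocess
import time

COMPOSE = ["docker-compose"]
CONSOLE = COMPOSE + ["exec", "app", "bin/console"]

# Assumption: "-d" (detached) is added so the following commands can run
# from the same script; the README itself lists plain "docker-compose up".
START_SERVICES = COMPOSE + ["up", "-d"]
CREATE_SCHEMA = CONSOLE + ["doctrine:schema:create"]
RUN_IMPORT = CONSOLE + ["app:workflow"]


def run_with_retry(command, attempts=10, delay_seconds=15.0,
                   runner=subprocess.run, sleep=time.sleep):
    """Run a command until it exits with status 0.

    Returns the number of the attempt that succeeded and raises
    RuntimeError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # The db container may still be creating its users; wait and retry.
            sleep(delay_seconds)
    raise RuntimeError(f"command failed after {attempts} attempts: {command}")


def first_time_setup():
    """Start the services, create the schema, then run the import pipeline."""
    subprocess.run(START_SERVICES, check=True)
    run_with_retry(CREATE_SCHEMA)
    subprocess.run(RUN_IMPORT, check=True)
```

Calling `first_time_setup()` from the project root runs the three steps in order; the `runner` and `sleep` parameters exist only so the retry logic can be exercised without Docker.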

External services

  • the app service contacts the article source website while its commands run

Data flow

Index => ArticleTeaser

  • link
  • listing date

Download => ArticleSource

  • HTML
  • listing date

Extract (from precisely defined parts of the page) => Article

  • title
  • date
  • text
  • district(s)

Derive (from the text, based on some fuzzy logic) => ArticleAddress

  • report id
  • previous report ids (if the article references former reports)
  • street names
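
To illustrate what "fuzzy logic" could mean for street names, here is a minimal sketch. It is not the project's implementation: it assumes German-language press releases and treats any capitalized word ending in a common street suffix as a street name. The suffix list and the function name `extract_street_names` are assumptions, and the heuristic will produce false positives (for example `Radweg`) and miss multi-word names.

```python
import re

# Assumption: a short, incomplete list of German street-name suffixes.
STREET_SUFFIXES = ("straße", "strasse", "allee", "weg", "platz",
                   "damm", "ufer", "chaussee")

# A capitalized word, optionally hyphenated, that ends in one of the suffixes.
# The suffix itself is matched case-insensitively so "Karl-Marx-Allee" works.
STREET_PATTERN = re.compile(
    r"\b[A-ZÄÖÜ][\w-]*?(?i:" + "|".join(STREET_SUFFIXES) + r")\b"
)


def extract_street_names(text):
    """Return candidate street names in order of first appearance, deduplicated."""
    matches = STREET_PATTERN.findall(text)
    return list(dict.fromkeys(matches))
```

A usage example with an invented sentence: `extract_street_names("Unfall in der Karl-Marx-Allee Ecke Hauptstraße")` yields `["Karl-Marx-Allee", "Hauptstraße"]`.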

Enrich => Coordinate

  • coordinates (from the street names)
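
The stages above can be summarized as plain data types. This is a sketch of the shapes only, written in Python for brevity; the project itself uses PHP with Doctrine, and its real entities may differ. Field names and types are assumptions derived from the lists above, and `latitude`/`longitude` is one assumed way to represent "coordinates".

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ArticleTeaser:
    """Index: one entry of the listing page."""
    link: str
    listing_date: date


@dataclass
class ArticleSource:
    """Download: the raw page behind a teaser."""
    html: str
    listing_date: date


@dataclass
class Article:
    """Extract: fields read from precisely defined parts of the page."""
    title: str
    date: date
    text: str
    districts: List[str] = field(default_factory=list)


@dataclass
class ArticleAddress:
    """Derive: information guessed from the text."""
    report_id: Optional[str] = None
    previous_report_ids: List[str] = field(default_factory=list)
    street_names: List[str] = field(default_factory=list)


@dataclass
class Coordinate:
    """Enrich: a position looked up from a street name (assumed lat/lon)."""
    latitude: float
    longitude: float
```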

Ideas

Problems

  • Articles sometimes receive updates (e.g. pressemitteilung.885469.php). They are then listed again (a new ArticleTeaser with a new date) and their content should be downloaded again (a new ArticleSource). How does this translate to Article, though? Is only the latest ArticleSource taken into account when building the Article (and its subsequent information)? Do we somehow make its versions available?
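
One possible answer to the versioning question, sketched here as an illustration rather than a decision: keep every downloaded source, group them by article link, and build the Article from the most recent one while the older ones remain available as history. The `SourceVersion` type, its `link` field, and both function names are hypothetical; the data flow above does not state that ArticleSource stores a link.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SourceVersion:
    link: str           # assumption: identifies the article across re-listings
    listing_date: date
    html: str


def group_versions(sources: Iterable[SourceVersion]) -> Dict[str, List[SourceVersion]]:
    """Group sources by link, each group sorted from oldest to newest."""
    versions: Dict[str, List[SourceVersion]] = {}
    for source in sources:
        versions.setdefault(source.link, []).append(source)
    for history in versions.values():
        history.sort(key=lambda s: s.listing_date)
    return versions


def latest_versions(sources: Iterable[SourceVersion]) -> Dict[str, SourceVersion]:
    """Pick the newest source per link; this is what an Article would be built from."""
    return {link: history[-1] for link, history in group_versions(sources).items()}
```

With this layout, exposing earlier versions amounts to returning the full list from `group_versions` instead of only its last element.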
