enformatik / covid-tracking

This project forked from covid19tracking/covid-tracking

urlwatch configuration to monitor state COVID-19 data

License: Apache License 2.0

Python 100.00%

As of March 7, 2021, we are no longer collecting new data. Learn about available federal data.


covid-tracking

Tracks state test counts and posts diffs of changed pages roughly every 30 minutes to the #urlwatch channel in the COVID-Tracking Slack.

Installation

  • To install the fork of urlwatch: pip install git+https://github.com/COVID19Tracking/urlwatch.git
    • You also need the tesseract-ocr package, for which you can find the installation instructions here
  • For development, you can also check out the COVID19Tracking/urlwatch repository and deploy from your local copy with pip3 install .
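
Before running urlwatch, it can help to confirm that the prerequisites are actually on the PATH. The following is a minimal sketch, assuming the tesseract-ocr package provides an executable named tesseract and the fork installs a urlwatch command; missing_tools is a hypothetical helper and not part of urlwatch.

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the executables from `names` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust if your install differs.
    missing = missing_tools(["urlwatch", "tesseract"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("urlwatch and tesseract are both available.")
```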

urls.yaml

This is a config file for urlwatch to detect changes to health department pages (see list here) and report them to a Slack channel for further analysis.

The project runs a fork of urlwatch with patches that allow in-browser execution for the pages that need it and webhook reporting of changed pages for IFTTT integration.
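
For orientation, each watcher in urls.yaml is one YAML document. The sketch below prints an entry in the shape that upstream urlwatch 2.x uses (a name, a url, and a comma-separated filter chain, with entries separated by ---). The state name, URL, and CSS selector are hypothetical placeholders, and job_entry is an illustrative helper, not part of urlwatch; entries in this repo may carry additional keys.

```python
import json


def job_entry(name, url, filters):
    """Format one urls.yaml watcher as YAML text.

    JSON string quoting is used because a JSON string is also a valid
    YAML double-quoted scalar, which avoids hand-written escaping.
    """
    chain = ",".join(filters)  # e.g. "css:.case-counts,html2text"
    return "\n".join([
        f"name: {json.dumps(name)}",
        f"url: {json.dumps(url)}",
        f"filter: {json.dumps(chain)}",
    ])


# Hypothetical watcher; none of these values come from the real urls.yaml.
entries = [
    job_entry(
        "Example State DOH",
        "https://health.example.gov/covid19",
        ["css:.case-counts", "html2text"],
    ),
]

# Separate entries with a YAML document marker.
print("\n---\n".join(entries))
```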

Getting started with modifying urls.yaml filters

Note: this does not cover environment setup. If you have any issues or questions about getting dependencies set up and running, please ask in #coding. There are lots of Python folks there to help out!

Jump into the repo root directory: cd PATH/covid-tracking

Output the current list of watchers: urlwatch --urls urls.yaml --list

Test the filter you're interested in, using the number from the --list command: urlwatch --urls urls.yaml --test-filter FILTER_NUM

And now the fun part:

  • Open an existing or new URL (finding one sometimes takes manual searching/research around state health websites, Slack channels, etc.)

  • Open Developer Tools (cmd+opt+i in Mac/Chrome)

  • Investigate where the test and/or case data is rendered on the page, and look for higher-level CSS classes, elements, or data structures to target

  • Update the filter using css:CSS_RULES, xpath:XPATH_RULES, or one of the other available filters defined here: urlwatch/filters. The parent repo is also a great resource: github.com/thp/urlwatch

  • If you use a css or xpath filter, make sure to add ,html2text to the end of it. This cleans up the output and keeps things readable in the #urlwatch channel :)

  • Re-run urlwatch --urls urls.yaml --test-filter FILTER_NUM to test/confirm your new rules are working.

  • Ask questions!

  • When ready, git push to master and let Josh Ellington and/or Zach Lipton know new rules are ready to be deployed.
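
To get a feel for what a class-based filter followed by html2text leaves behind, the idea can be tried offline on a saved snippet. The sketch below is not how urlwatch implements its css filter; it swaps in the standard library's html.parser and matches a single class name only. The sample HTML, the case-counts class, and the numbers are all made up for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "br", "col", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text found inside any element carrying a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0  # greater than 0 while inside a matching element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.css_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, css_class):
    """Return the text inside elements with `css_class`, one chunk per line."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Made-up page fragment standing in for a state health department page.
SAMPLE = """
<div class="header">State Department of Health</div>
<table class="case-counts">
  <tr><td>Total tested</td><td>1,234</td></tr>
  <tr><td>Positive</td><td>56</td></tr>
</table>
<div class="footer">Last updated daily</div>
"""

print(extract_text(SAMPLE, "case-counts"))
```

Targeting the table's class keeps the header and footer out of the result, which is the kind of narrowing that keeps diffs in #urlwatch small and readable.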

Most important, keep an eye on #urlwatch! These pages are changing/going down/implementing new rules daily at this point. Responding quickly will keep everyone informed and updated on these data changes.

TODO

  • fine tune filters to narrow in on the correct data
  • for those states that have structured data, automatically parse out data fields and detect changes
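
One possible shape for the second item, sketched under the assumption that a state's structured data has already been parsed into a flat dictionary of fields. The field names and numbers are invented for illustration, and changed_fields is a hypothetical helper, not existing code in this repo.

```python
def changed_fields(old, new):
    """Map each field whose value differs to an (old, new) pair.

    Fields missing from one snapshot are reported with None on that side.
    """
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


# Invented snapshots from two consecutive checks of the same page.
previous = {"total_tested": 1234, "positive": 56}
current = {"total_tested": 1301, "positive": 56, "deaths": 2}

for field, (before, after) in changed_fields(previous, current).items():
    print(f"{field}: {before} -> {after}")
```

Reporting per-field changes instead of a text diff would make it easier to see which number moved, at the cost of needing a parser per state.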

Contributors

zachlipton, joshellington, gilmourj, smike, jyouyang, nodots, joshuaellinger, hammer, kevee, olivierlacan, jasonlcrane, space-buzzer
