Copyright 2013 NPR. All rights reserved. No part of these materials may be reproduced, modified, stored in a retrieval system, or retransmitted, in any form or by any means, electronic, mechanical or otherwise, without prior written permission from NPR.
(Want to use this code? Send an email to [email protected]!)
- What is this?
- Assumptions
- What's in here?
- Install requirements
- Project secrets
- Adding a template/view
- Run the project locally
- Editing workflow
- Run Javascript tests
- Run Python tests
- Compile static assets
- Test the rendered app
- Deploy to S3
- Update the data
A lobbying data explorer for the Missouri Legislature. This project is a collaboration between nprapps and St. Louis Public Radio.
The following things are assumed to be true in this documentation.
- You are running OSX.
- You are using Python 2.7. (Probably the version that came with OSX.)
- You have virtualenv and virtualenvwrapper installed and working.
For more details on the technology stack used with the app-template, see our development environment blog post.
The project contains the following folders and important files:
- confs -- Server configuration files for nginx and uwsgi. Edit the templates then fab <ENV> render_confs, don't edit anything in confs/rendered directly.
- data -- Data files, such as those used to generate HTML.
- etc -- Miscellaneous scripts and metadata for project bootstrapping.
- jst -- Javascript (Underscore.js) templates.
- less -- LESS files, will be compiled to CSS and concatenated for deployment.
- templates -- HTML (Jinja2) templates, to be compiled locally.
- tests -- Python unit tests.
- www -- Static and compiled assets to be deployed. (a.k.a. "the output")
- www/live-data -- "Live" data deployed to S3 via cron jobs or other mechanisms. (Not deployed with the rest of the project.)
- www/test -- Javascript tests and supporting files.
- app.py -- A Flask app for rendering the project locally.
- app_config.py -- Global project configuration for scripts, deployment, etc.
- copytext.py -- Code supporting the Editing workflow
- crontab -- Cron jobs to be installed as part of the project.
- fabfile.py -- Fabric commands automating setup and deployment.
- public_app.py -- A Flask app for running server-side code.
- render_utils.py -- Code supporting template rendering.
- requirements.txt -- Python requirements.
Node.js is required for the static asset pipeline. If you don't already have it, get it like this:
brew install node
curl https://npmjs.org/install.sh | sh
Then install the project requirements:
cd stl-lobbying
npm install less universal-jst -g --prefix node_modules
mkvirtualenv --no-site-packages stl-lobbying
pip install -r requirements.txt
Project secrets should never be stored in app_config.py or anywhere else in the repository. They will be leaked to the client if you do. Instead, always store passwords, keys, etc. in environment variables and document that they are needed here in the README.
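As a rough illustration of reading secrets from environment variables, here is a minimal sketch. The STL_LOBBYING_ prefix and the get_secrets helper are hypothetical; the project does not define a naming convention for its secrets in this README.

```python
import os

# Hypothetical naming convention: every secret lives in an environment
# variable that starts with this prefix. The project does not define one.
SECRET_PREFIX = 'STL_LOBBYING_'


def get_secrets(environ=None, prefix=SECRET_PREFIX):
    """Return a dict of secrets found in the environment, prefix removed."""
    if environ is None:
        environ = os.environ
    secrets = {}
    for name, value in environ.items():
        if name.startswith(prefix):
            # 'STL_LOBBYING_API_KEY' becomes 'API_KEY'
            secrets[name[len(prefix):]] = value
    return secrets
```

With a variable such as STL_LOBBYING_API_KEY exported in your shell, get_secrets()['API_KEY'] would return its value without the key ever being committed to the repository.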
A site can have any number of rendered templates (i.e. pages). Each will need a corresponding view. To create a new one:
- Add a template to the templates directory. Ensure it extends _base.html.
- Add a corresponding view function to app.py. Decorate it with a route to the page name, e.g. @app.route('/filename.html')
- By convention only views that end with .html and do not start with _ will automatically be rendered when you call fab render.
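The rendering convention above can be sketched as a small filter. This is only an illustration of the rule, not the code in fabfile.py: it assumes the underscore check applies to the view function's name and the .html check to its route, and the view names below are hypothetical.

```python
def should_render(view_name, route):
    """Apply the convention: the route ends with .html and the view is not private."""
    return route.endswith('.html') and not view_name.startswith('_')


# Hypothetical (view name, route) pairs for illustration only.
views = [
    ('index', '/index.html'),
    ('_draft', '/draft.html'),    # skipped: view name starts with _
    ('feed', '/feed.json'),       # skipped: route does not end with .html
    ('about', '/about.html'),
]

renderable = [route for name, route in views if should_render(name, route)]
```

Under these assumptions only /index.html and /about.html would be rendered.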
A Flask app is used to run the project locally. It will automatically recompile templates and assets on demand.
workon stl-lobbying
python app.py
Visit localhost:8000 in your browser.
The app is rigged up to Google Docs for a simple key/value store that provides an editing workflow.
View the sample copy spreadsheet here. A few things to note:
- If there is a column called key, there is expected to be a column called value and rows will be accessed in templates as key/value pairs
- Rows may also be accessed in templates by row index using iterators (see below)
- You may have any number of worksheets
- This document must be "published to the web" using Google Docs' interface
This document is specified in app_config with the variable COPY_GOOGLE_DOC_KEY. To use your own spreadsheet, change this value to reflect your document's key (found in the Google Docs URL after &key=).
The app template is outfitted with a few fab utility functions that make pulling changes and updating your local data easy.
To pull the latest version of the document, run:
fab update_copy
Note: update_copy runs automatically whenever fab render is called.
At the template level, Jinja maintains a COPY object that you can use to access your values in the templates. Using our example sheet, to use the byline key in templates/index.html:
{{ COPY.attribution.byline }}
More generally, you can access anything defined in your Google Doc like so:
{{ COPY.sheet_name.key_name }}
You may also access rows using iterators. In this case, the column headers of the spreadsheet become keys and the row cells values. For example:
{% for row in COPY.sheet_name %}
{{ row.column_one_header }}
{{ row.column_two_header }}
{% endfor %}
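To make the two access patterns concrete, here is a minimal sketch of an object that supports both key lookup and row iteration. It is not the code in copytext.py; the Row and Sheet classes and the sample byline value are assumptions made for illustration.

```python
class Row(object):
    """One spreadsheet row; column headers become attribute names."""

    def __init__(self, headers, cells):
        self._data = dict(zip(headers, cells))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__['_data'][name]
        except KeyError:
            raise AttributeError(name)


class Sheet(object):
    """A worksheet that can be iterated by row or read by key."""

    def __init__(self, headers, rows):
        self._rows = [Row(headers, cells) for cells in rows]
        self._by_key = {}
        # Key/value access is only available when both columns exist.
        if 'key' in headers and 'value' in headers:
            for row in self._rows:
                self._by_key[row.key] = row.value

    def __iter__(self):
        return iter(self._rows)

    def __getattr__(self, name):
        try:
            return self.__dict__['_by_key'][name]
        except KeyError:
            raise AttributeError(name)


# Hypothetical worksheet contents.
attribution = Sheet(['key', 'value'], [['byline', 'Jane Doe']])
byline = attribution.byline                      # key/value access
pairs = [(row.key, row.value) for row in attribution]  # row iteration
```

Jinja resolves dotted names through attribute lookup, so an object shaped like this would serve both the key-style and iterator-style templates shown above.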
To run the Javascript tests, start the project locally and visit localhost:8000/test/SpecRunner.html.
Python unit tests are stored in the tests directory. Run them with fab tests.
Compile LESS to CSS, compile Javascript templates to Javascript, and minify all assets:
workon stl-lobbying
fab render
(This is done automatically whenever you deploy to S3.)
If you want to test the app once you've rendered it out, just use the Python webserver:
cd www
python -m SimpleHTTPServer
To deploy to S3:
fab staging master deploy
Updating other datasets
- The canonical representation of the legislators is the legislator demographics Google document. This document should only ever contain the current legislators. If a district is vacant you should include a row for it with the word VACANT in the last_name column. This will cause the vacant flag to be set on the correct Legislator database entry (other fields will be set to blank).
- The canonical source for lobbying organization names and categories is the organization name lookup Google document. New organizations/organization misspellings should be added to this document.
Be sure to republish these spreadsheets by going to File, Publish to the web... and then clicking Republish now. Otherwise they may still be cached when you run the loader.
Loading the data
fab local_bootstrap
This will fetch the two documents mentioned above and scrape the latest data from the Missouri website. Any warnings or errors will be printed to the console once the loader is finished.
Errors must be resolved before you complete the update process. If an error refers to an unknown organization name then it should be added to the organization name lookup Google document.
Warnings do not need to be resolved unless they indicate the source data is invalid. There are likely to be a small number of date errors each year. We can safely ignore these.
Rerun the loader until all errors have been successfully resolved. (If you are doing a lot of this, you can load just the recent data using fab local_bootstrap_sample.)
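The rule that unknown organization names are errors can be sketched like this. It is a simplified illustration, not the loader itself: the resolve_organizations helper, the error message format, and the sample names are all hypothetical.

```python
def resolve_organizations(raw_names, lookup):
    """Map raw organization names to canonical names.

    Names missing from the lookup are reported as errors, which signals
    that they should be added to the organization name lookup document.
    """
    resolved = {}
    errors = []
    for name in raw_names:
        if name in lookup:
            resolved[name] = lookup[name]
        else:
            errors.append('Unknown organization name: %s' % name)
    return resolved, errors


# Hypothetical lookup: a misspelling and its canonical form.
lookup = {'Acme Corp': 'Acme Corp', 'Acme Corp.': 'Acme Corp'}
resolved, errors = resolve_organizations(
    ['Acme Corp.', 'Unlisted Group'], lookup)
```

In this example the misspelling resolves to its canonical name and the unlisted name produces one error, which would go away once a row for it is added to the lookup document and the loader is rerun.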
Test the site
python app.py
Deploy
Finally, rerender and deploy the site to production:
fab production master deploy deploy_pages