BAS Style Kit File Upload Endpoint
A minimal API implementing a simple form action for testing file upload components in the BAS Style Kit.
This API is also used for testing generic features used across BAS APIs. See the Generic features section for more information.
Purpose
The BAS Style Kit includes interactive components for file uploads. To help develop these components, and to simulate various error states, a real service that will accept (or decline) uploads was needed to give realistic results.
This API provides a set of endpoints which can be used as form actions to test different situations, including single/multiple uploads and errors such as uploads that are too large. This API is intentionally separate from the Style Kit in order to test cross-origin requests, which are restricted by web browsers for security.
Note: This API does not, and is not intended to, store uploaded files. It is not intended for any use beyond testing or demonstrating file upload components in the Style Kit.
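As a rough illustration of the kind of form action this API provides, the sketch below shows a minimal Flask endpoint that accepts a single upload and returns metadata without persisting anything. The route name and response shape here are hypothetical, not the API's real interface; see the Usage documentation for the actual endpoints.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


# Hypothetical sketch only - the real endpoint names and responses differ
@app.route("/upload-single", methods=["POST"])
def upload_single():
    # Reject requests that don't include a file part named 'file'
    if "file" not in request.files:
        return jsonify({"error": "no file in request"}), 400
    upload = request.files["file"]
    # Nothing is persisted; only metadata about the upload is returned
    return jsonify({"filename": upload.filename, "content_type": upload.mimetype}), 200
```

A Style Kit form could then point its `action` at such an endpoint to exercise success and error states.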
Generic features
In addition to providing a file upload endpoint, this API is used for developing features that are common or generic in nature for use by other BAS APIs. Some of these features are generic, or can demonstrate generic features, such as using Request IDs. Others are specific to developing APIs using Flask or its wider ecosystem.
Using this API, rather than developing a synthetic 'demonstrator' application, means that effort is not wasted on an API that does nothing useful by itself. Consequently this API may seem 'over the top' for what it does, which is, after all, quite simple. Some features, such as database support, are out of scope however, as they wouldn't make sense to add based on what this API does.
Usage
See the Usage information for the various endpoints available and general information about this API.
Implementation
This API is implemented as a minimal Python Flask application. No information is persisted by this API; however, logs may be captured depending on how the API is deployed, or for Error tracking.
Configuration
Application configuration is set within `config.py`. Options use global or per-environment defaults which can be overridden if needed using environment variables, or a `.env` (Dot ENV) file.
Options include values such as application secrets and feature flags, used to enable or disable features such as error logging.
The application environment is set using the `FLASK_ENV` environment variable. A sample Dot ENV file, `.env.example`, describes how to set any required, or frequently changed, options. See `config.py` for all available options.
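The pattern described above (per-environment defaults, overridable via environment variables) could be sketched like this. The class and option names here are assumptions for illustration, not the contents of the real `config.py`:

```python
import os


class Config:
    # Global defaults; any option can be overridden via an environment variable
    DEBUG = False
    SENTRY_ENABLED = True


class DevelopmentConfig(Config):
    # Per-environment defaults: debug on, error tracking off in development
    DEBUG = True
    SENTRY_ENABLED = False


def load_option(name, default):
    # Environment variables take precedence over per-environment defaults
    raw = os.environ.get(name)
    if raw is None:
        return default
    if isinstance(default, bool):
        return raw.lower() in ("1", "true", "yes")
    return raw


# FLASK_ENV selects which set of defaults applies
env = os.environ.get("FLASK_ENV", "production")
config = DevelopmentConfig if env == "development" else Config
sentry_enabled = load_option("SENTRY_ENABLED", config.SENTRY_ENABLED)
```

In a real deployment, a `.env` file would supply the environment variables read here.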
Request IDs
To aid in debugging, all requests will include an `X-Request-ID` header with one or more values. This can be used to trace requests through different services such as a load balancer, cache and other layers. Request IDs are managed by the Request ID middleware. The `X-Request-ID` header is returned to users and other components as a response header.
See the Correlation ID documentation for how the BAS API Load Balancer handles Request IDs.
If there isn't an `X-Request-ID` header, one will be added by this API.
To access the Request ID within the application:
```python
from flask import request

print(request.environ.get("HTTP_X_REQUEST_ID"))
```
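The "add one if missing, echo it back" behaviour described above could be implemented as WSGI middleware along these lines. This is a minimal sketch, not the project's actual Request ID middleware:

```python
import uuid


class RequestIdMiddleware:
    """Sketch of WSGI middleware ensuring every request carries an X-Request-ID."""

    def __init__(self, wsgi_app):
        self.wsgi_app = wsgi_app

    def __call__(self, environ, start_response):
        # Generate a Request ID if the client or an upstream proxy didn't send one
        if not environ.get("HTTP_X_REQUEST_ID"):
            environ["HTTP_X_REQUEST_ID"] = str(uuid.uuid4())
        request_id = environ["HTTP_X_REQUEST_ID"]

        def _start_response(status, headers, exc_info=None):
            # Echo the Request ID back to the caller as a response header
            headers = list(headers) + [("X-Request-ID", request_id)]
            return start_response(status, headers, exc_info)

        return self.wsgi_app(environ, _start_response)
```

With Flask, this would typically be attached by wrapping `app.wsgi_app`.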
Error tracking
To ensure the reliability of this API, errors are logged to Sentry for investigation and analysis.
Through Continuous Deployment, each commit to the `master` branch in the project repository creates a new Sentry release, associated with the production environment through a deployment using the Sentry CLI.
Health checks
Endpoints are available to allow the health of this API to be monitored. This can be used by load balancers to avoid unhealthy instances, or by monitoring/reporting tools to prompt repairs by operators.
`/meta/health/canary`
[GET|OPTIONS] Reports on the overall health of this service as a boolean healthy/unhealthy status. Returns a `204 - NO CONTENT` response when healthy. Any other response should be considered unhealthy.
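A canary endpoint of this kind reduces to a route that returns an empty success response, roughly as sketched below (a minimal illustration, not the project's actual implementation):

```python
from flask import Flask

app = Flask(__name__)


@app.route("/meta/health/canary", methods=["GET", "OPTIONS"])
def health_canary():
    # An empty 204 body signals 'healthy'; any other response counts as unhealthy
    return "", 204
```

A load balancer would poll this route and eject instances that stop answering with 204.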
Setup
Local development
To set up a local copy of this API, access to this repository, Docker and Docker Compose are required.
$ cd bas-style-kit-file-upload-endpoint
If you have access to the BAS GitLab instance, you can pull the application Docker image from the BAS Docker Registry. Otherwise you will need to build the Docker image locally.
# If you have access to gitlab.data.bas.ac.uk
$ docker login docker-registry.data.bas.ac.uk
$ docker-compose pull
# If you don't have access
$ docker-compose build
Copy `.env.example` to `.env` and edit the file to set at least any required (uncommented) options.
To run the API using the Flask development server (which reloads automatically if source files are changed):
$ docker-compose up
To run commands against the Flask application (such as Integration tests):
$ docker-compose run app flask [command]
# E.g.
$ docker-compose run app flask test
Heroku
To set up the Heroku project for this application, access to this repository, Heroku and Terraform are required.
Note: Make sure the `HEROKU_API_KEY` and `HEROKU_EMAIL` environment variables are set within your local shell.
$ cd bas-style-kit-file-upload-endpoint
$ cd provisioning/terraform
$ docker-compose run terraform
$ terraform init
$ terraform apply
Visit the Heroku project settings to set Config Vars (environment variables) for sensitive settings:
Config Var | Config Value | Description |
---|---|---|
`SENTRY_DSN` | Available from Sentry | Identifier for application in Sentry error tracking |
Development
This API is developed as a Flask application using the conventions outlined in the Flasky example project [1].
Environments and feature flags are used to control which elements of this application are enabled in different situations. For example in the development environment, Sentry error tracking is disabled and Flask's debug mode is on.
New features should be implemented with appropriate Configuration options available. Sensible defaults for each environment, and if needed feature flags, should allow end-users to fine tune which features are enabled.
Ensure `.env.example` is kept up-to-date if any configuration options are added or changed.
Ensure Integration tests are written for any new feature, or changes to existing features.
Ensure the end user usage documentation is kept up-to-date as API methods, options, etc. change.
[1] If in BAS, access to the associated book is available from the Web & Applications Team.
Code Style
PEP-8 style and formatting guidelines must be used for this project, with the exception of the 80 character line limit.
Flake8 is used to ensure compliance, and is run on each commit through Continuous Integration.
To check compliance locally:
$ docker-compose run app flake8 . --ignore=E501
Dependencies
Python dependencies should be defined using Pip through the `requirements.txt` file. The Docker image is configured to install these dependencies into the application image for consistency across different environments. Dependencies should be periodically reviewed and updated as new versions are released.
To add a new dependency:
$ docker-compose run app ash
$ pip install [dependency]==
# this will display a list of available versions, add the latest to `requirements.txt`
$ exit
$ docker-compose down
$ docker-compose build
If you have access to the BAS GitLab instance, push the Docker image to the BAS Docker Registry:
$ docker login docker-registry.data.bas.ac.uk
$ docker-compose push
Dependency vulnerability scanning
To ensure the security of this API, all dependencies are checked against Snyk for vulnerabilities.
Warning: Snyk relies on known vulnerabilities and can't check for issues that are not in its database. As with all security tools, Snyk is an aid for spotting common mistakes, not a guarantee of secure code.
Some vulnerabilities have been ignored in this project; see `.snyk` for definitions and the Dependency vulnerability exceptions section for more information.
Through Continuous Integration, current dependencies are tested on each commit and a snapshot uploaded to Snyk. This snapshot is then monitored for vulnerabilities.
Dependency vulnerability exceptions
This project contains known vulnerabilities that have been ignored for a specific reason.
- Py-Yaml `yaml.load()` function allows Arbitrary Code Execution
    - currently no known or planned resolution
    - indirect dependency, required through the `bandit` package
    - severity is rated high
    - risk judged to be low given the nature of this API
    - ignored for 1 year for re-review
Static security scanning
To ensure the security of this API, source code is checked against Bandit for issues such as not sanitising user inputs or using weak cryptography.
Warning: Bandit is a static analysis tool and can't check for issues that are only detectable when running the application. As with all security tools, Bandit is an aid for spotting common mistakes, not a guarantee of secure code.
Through Continuous Integration, each commit is tested.
Logging
In a request context, the default Flask log will include the URL and Request ID of the current request.
In other cases, these fields are substituted with `NA`.
Note: When not running in Flask Debug mode, only messages with a severity of warning or higher will be logged.
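The substitution of log fields with `NA` outside a request context could be implemented with a standard `logging.Filter`, roughly as below. This is a sketch under assumed field names (`url`, `request_id`), not the project's actual logging setup:

```python
import logging


class RequestContextFilter(logging.Filter):
    """Sketch: inject URL and Request ID into log records, or 'NA' outside a request."""

    def __init__(self, get_context=None):
        super().__init__()
        # get_context returns (url, request_id), or None when no request is active
        self.get_context = get_context or (lambda: None)

    def filter(self, record):
        context = self.get_context()
        # Fall back to 'NA' for both fields outside a request context
        record.url, record.request_id = context if context else ("NA", "NA")
        return True


logger = logging.getLogger("app")
logger.addFilter(RequestContextFilter())
```

With Flask, `get_context` would typically consult `flask.has_request_context()` and `flask.request`.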
Request validation
All user inputs MUST be validated and sanitised as needed. Where possible enumerated options should be used over free-text choices.
The Cerberus library is used for validation, with schemas defined for each method.
Where validation fails, the request should be aborted as a bad request, passing the validation object and schema to the `(meta.errors.)error_request_validation()` function for creating API errors.
To make these errors more relevant, the validation schema is extended to include these extra properties (per field):

- `request_type` (required)

Cerberus does not allow the validation schema to be extended, so a compatible version needs to be created using the `(meta.utils.)get_cerberus_schema` function.
For example, to validate a method (`foo`), with a single request parameter (`bar`), which accepts a controlled list of values (`apple` or `orange`):
```python
from http import HTTPStatus

from cerberus import Validator
from flask import abort, jsonify, make_response


@blueprint.route('/foo/<bar>')
def foo(bar: str):
    """
    Example request method

    :type bar: str
    :param bar: example request parameter
    """
    # Validate request
    foo_schema = {
        'bar': {
            'type': 'string',
            'request_type': 'parameter',
            'required': True,
            'allowed': ['apple', 'orange']
        }
    }
    foo_document = {'bar': bar}
    validator = Validator(get_cerberus_schema(foo_schema))
    if not validator.validate(foo_document):
        payload = {'errors': error_request_validation(validator, foo_schema)}
        abort(make_response(jsonify(payload), HTTPStatus.BAD_REQUEST))

    # Rest of method
    pass
```
`request_type`

Single string value representing where a field appears in a request, for example as a URL parameter or header. Values are restricted to one of:

Request Type | Value | Description |
---|---|---|
Parameter | `parameter` | Use for URL parameters |
Reserved routes
These routes are reserved and MUST NOT be implemented:
- `/meta/errors/generic-not-found` - used to test the behaviour of a genuine 'not found' URL
Internal request methods
Some additional API endpoints are available for development/testing purposes. These endpoints are not documented publicly and should not be relied upon outside of development or testing.
`/meta/errors/generic-bad-request`
[GET] Triggers a generic `400 - Bad Request` error to test the configured error handler. Returns a JSON formatted error.
`/meta/errors/generic-internal-server-error`
[GET] Triggers a generic `500 - Internal Server Error` error to test the configured error handler. Returns a JSON formatted error.
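An error-triggering endpoint of this kind pairs an `abort()` call with a registered error handler that renders JSON. The sketch below illustrates the pattern; the handler's payload shape is an assumption, not the API's documented error format:

```python
from flask import Flask, abort, jsonify

app = Flask(__name__)


# Sketch: render a generic 500 error as JSON rather than Flask's default HTML page
@app.errorhandler(500)
def handle_internal_error(error):
    payload = {"errors": [{"status": 500, "title": "Internal Server Error"}]}
    return jsonify(payload), 500


@app.route("/meta/errors/generic-internal-server-error")
def generic_internal_server_error():
    # Deliberately trigger the error handler so clients can test their handling
    abort(500)
```

The 400 variant follows the same shape with `abort(400)` and a matching handler.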
`/meta/logging/entries/{logging_level}`
[POST] Triggers a hard-coded message to be logged to the Application log at the severity given by the `{logging_level}` request parameter. This method does not return a response directly.
The `{logging_level}` parameter accepts any logging level, i.e. `debug`, `info`, `warning`, `error`, `critical`.
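Dispatching on a level name like this can be done by looking up the matching logger method. A minimal sketch (the function name and message are hypothetical):

```python
import logging

logger = logging.getLogger("app")


def log_test_entry(logging_level):
    # Map the request parameter to the matching logger method,
    # e.g. 'warning' -> logger.warning
    log = getattr(logger, logging_level.lower())
    log("Test log entry at level: %s", logging_level)
```

In the endpoint, `logging_level` would come straight from the URL parameter after validation against the allowed level names.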
Note: When not running in Flask Debug mode, only messages with a severity of warning or higher will be logged.
Testing
Integration tests
This project uses integration tests to ensure features work as expected and to guard against regressions and vulnerabilities.
The inbuilt Python `unittest` library is used for running tests, using Flask's test framework. Test cases are defined in files within `tests/` and are automatically loaded when using the `test` custom Flask CLI command.
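A test case in this style might look like the sketch below, exercising the health check endpoint through Flask's test client. The app and route here are stand-ins for illustration, not the project's actual test files:

```python
import unittest

from flask import Flask

# Stand-in app for illustration; real tests would import the project's app factory
app = Flask(__name__)


@app.route("/meta/health/canary")
def health_canary():
    return "", 204


class HealthCheckTestCase(unittest.TestCase):
    def setUp(self):
        # Flask's built-in test client simulates requests without a running server
        self.client = app.test_client()

    def test_canary_healthy(self):
        response = self.client.get("/meta/health/canary")
        self.assertEqual(response.status_code, 204)
```

The `test` CLI command discovers and runs cases like this from `tests/` automatically.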
To run tests manually:
$ docker-compose run -e FLASK_ENV=testing app flask test
Pip dependencies are checked on each commit and then monitored for future vulnerabilities.
Tests are automatically run on each commit through Continuous Integration.
Continuous Integration
All commits will trigger a Continuous Integration process using GitLab's CI/CD platform, configured in `.gitlab-ci.yml`. This process will run the application Integration tests.
Review apps
To allow unreleased changes to be viewed by others, short-term instances of this API (known as review apps) are created on each commit within a branch (except the master branch, which is part of Deployment). Review apps are managed by GitLab as part of Continuous Integration.
Each review app is a standalone Heroku application, configured in the same way as the production instance. The URL for each application is computed based on a common prefix (`bas-ra`), the project (in GitLab) and the CI/CD pipeline being run.
See the Heroku section for more information on how this project runs on Heroku.
Deployment
Heroku
This API is deployed on Heroku as an app under the [email protected] shared account, using their container hosting option.

The Heroku project uses a Docker image built from the application image with the application source included and development related features disabled. This image is built and pushed to Heroku on each commit to the `master` branch through Continuous Deployment.
Note: This deployment is considered both a staging and production environment due to the low value and developer oriented nature of this API. The development environment is not deployed as it is only intended for local use.
Continuous Deployment
All commits to the `master` branch will trigger a Continuous Deployment process using GitLab's CI/CD platform, configured in `.gitlab-ci.yml`.
This process will build a Heroku specific Docker image using a 'Docker In Docker' (DIND/DND) runner and push this image to Heroku.
It will also create a Sentry release and an associated deployment of that release to the production environment.
Release procedure
At release
- create a release branch
- remove `-develop` from version string in:
    - `docker-compose.yml` - app Docker image
    - `.gitlab-ci.yml` - default Docker image
    - `Dockerfile.heroku` - `FROM` directive
- build & push the Docker image
- close release in `CHANGELOG.md`
- push changes, merge the release branch into `master` and tag with version
- delete the `-develop` tag from the project container registry
The application will be automatically deployed into production using Continuous Deployment.
After release
- create a next-release branch
- bump the version with `-develop` appended in:
    - `docker-compose.yml` - app Docker image
    - `.gitlab-ci.yml` - default Docker image
    - `Dockerfile.heroku` - `FROM` directive
- build & push the Docker image
- push changes and merge the next-release branch into `master`
The application will be automatically deployed into production using Continuous Deployment.
Feedback
The maintainer of this project is the BAS Web & Applications Team; they can be contacted at: [email protected].
Issue tracking
This project uses issue tracking, see the Issue tracker for more information.
Note: Read & write access to this issue tracker is restricted. Contact the project maintainer to request access.
License
© UK Research and Innovation (UKRI), 2018, British Antarctic Survey.
You may use and re-use this software and associated documentation files free of charge in any format or medium, under the terms of the Open Government Licence v3.0.
You may obtain a copy of the Open Government Licence at http://www.nationalarchives.gov.uk/doc/open-government-licence/