b97pla / snpseq-archive-verify

This project forked from molmed/snpseq-archive-verify

REST service for verifying uploaded archives

License: MIT License

Python 94.25% Dockerfile 3.03% Shell 2.72%

SNPSEQ Archive Verify

A self-contained (aiohttp) REST service that helps verify uploaded SNP&SEQ archives by first downloading the archive from PDC and then comparing the MD5 sums of all associated files. The downloaded files are deleted on successful verification and retained if any error occurs.
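The comparison step can be sketched as follows. This is a minimal illustration rather than the service's own code: it assumes the archive contains a manifest in the usual `md5sum` format (one `<hex digest>  <relative path>` pair per line), and the manifest name `checksums.md5` and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(archive_dir, checksum_file="checksums.md5"):
    """Return relative paths that are missing or whose MD5 differs from the manifest.

    The manifest name and its md5sum-style line format are assumptions.
    """
    archive_dir = Path(archive_dir)
    mismatches = []
    for line in (archive_dir / checksum_file).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.lstrip("*")  # md5sum marks binary mode with '*'
        target = archive_dir / rel_path
        if not target.is_file() or md5_of_file(target) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

An empty list from `find_mismatches` corresponds to the success case, in which the downloaded files would be deleted. A non-empty list corresponds to the error case, in which they are kept for inspection.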

The service consists of three components, all of which must be running. The system must also be set up so that they can communicate over the configured protocols and ports (refer to the Redis documentation for details):

  • archive-verify-ws REST service
  • Redis queue server
  • RQ worker

The web service enqueues certain job functions in the RQ/Redis queue, where they get picked up by the separate RQ worker process.
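The hand-off between the web service and the worker can be illustrated with the standard library alone. The sketch below is not RQ: it swaps in an in-process `queue.Queue` and a thread for Redis and the RQ worker, so that it runs without any server. The job function `verify_archive` is a hypothetical placeholder.

```python
import queue
import threading
import uuid


def verify_archive(archive):
    """Placeholder job function; the real job downloads and checks the archive."""
    return f"verified {archive}"


def enqueue(jobs, func, *args):
    """Put a job on the queue and return its id, as the web service would."""
    job_id = str(uuid.uuid4())
    jobs.put((job_id, func, args))
    return job_id


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives, storing each result by job id."""
    while True:
        item = jobs.get()
        if item is None:
            break
        job_id, func, args = item
        results[job_id] = func(*args)


jobs, results = queue.Queue(), {}
thread = threading.Thread(target=worker, args=(jobs, results))
thread.start()
job_id = enqueue(jobs, verify_archive, "my_001XBC_archive")
jobs.put(None)  # tell the worker to stop
thread.join()
print(results[job_id])
```

In the actual deployment the queue lives in Redis and the worker is a separate process, which is why all three components have to be running and able to reach each other.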

Pre-requisites

You will need Python >=3.9 and Redis.

Download and install Redis.

Install

It is recommended to set up the service in a virtual environment. venv is used below with bash on a Linux system.

python3 -m venv --upgrade-deps .venv
source .venv/bin/activate
pip install .

Running the service

Start the Redis server and RQ worker:

redis-server
rq worker

Start the REST service:

archive-verify-ws -c=config/

Mock Downloading

If you are running this service locally and don't have IBM's dsmc client installed, you can skip the downloading step and verify an archive that is already on your machine.

To use this method:

  • Copy an archive that has been pre-downloaded from PDC into the verify_root_dir set in app.yaml.

  • Delete or edit some files in the archive if you wish to trigger a validation error.

  • In app.yaml, set:

    pdc_client: "MockPdcClient"

Note that the archive will be deleted from verify_root_dir on successful verification.

Naming Conventions for Mock Download

Note that when an archive is downloaded from PDC using snpseq-archive-verify, the download directory is named after the archive plus the RQ job id, like so:

{verify_root_dir}/{archive_name}_{rq_job_id}

When the download is mocked, the service searches verify_root_dir for archive_name and uses the first directory found, ignoring the rq_job_id.
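The naming convention and the mock lookup can be sketched like this. Both helper names are hypothetical, and matching on a name prefix with a sorted listing is an assumption made to keep the example deterministic; the service's own matching rule and ordering may differ.

```python
from pathlib import Path


def download_dir(verify_root_dir, archive_name, rq_job_id):
    """Directory a real download is written to: {archive_name}_{rq_job_id}."""
    return Path(verify_root_dir) / f"{archive_name}_{rq_job_id}"


def find_mock_archive(verify_root_dir, archive_name):
    """Return the first directory whose name starts with archive_name, or None.

    Sorting makes "first" deterministic here; this is an assumption, not
    necessarily the order the service uses.
    """
    for candidate in sorted(Path(verify_root_dir).iterdir()):
        if candidate.is_dir() and candidate.name.startswith(archive_name):
            return candidate
    return None
```

With this rule, a pre-downloaded directory can keep the job id suffix from an earlier real download, or have no suffix at all, and still be picked up by the mock client.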

Running tests

source .venv/bin/activate
pip install -e .[test]
nosetests tests/

REST endpoints

Enqueue a verification job of a specific archive:

curl -i -X "POST" -d '{"host": "my-host", "description": "my-descr", "archive": "my_001XBC_archive"}' http://localhost:8989/api/1.0/verify

Enqueue a download job of a specific archive:

curl -i -X "POST" -d '{"host": "my-host", "description": "my-descr", "archive": "my_001XBC_archive"}' http://localhost:8989/api/1.0/download

Check the current status of an enqueued job:

curl -i -X "GET" http://localhost:8989/api/1.0/status/<job-uuid-returned-from-verify-endpoint>
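The same calls can be made from Python with only the standard library. This is a sketch under assumptions: the helper names are illustrative, and the base URL simply mirrors the curl examples above. As with `curl -d`, no explicit Content-Type header is set.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8989/api/1.0"  # taken from the curl examples


def build_job_request(action, host, description, archive, base_url=BASE_URL):
    """Build (but do not send) a POST request for the verify or download endpoint."""
    if action not in ("verify", "download"):
        raise ValueError(f"unsupported action: {action}")
    payload = {"host": host, "description": description, "archive": archive}
    return urllib.request.Request(
        f"{base_url}/{action}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )


def build_status_request(job_id, base_url=BASE_URL):
    """Build a GET request for the status endpoint of a given job."""
    return urllib.request.Request(f"{base_url}/status/{job_id}", method="GET")


def send(request):
    """Send a request and decode the JSON body of the response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `send(build_job_request("verify", "my-host", "my-descr", "my_001XBC_archive"))` would enqueue a job, and the returned job id can then be passed to `build_status_request`.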

Docker container

For testing purposes, you can also build a Docker container using the resources in the docker/ folder:

# build and start Docker container
docker/up

This will build and start a Docker container running an nginx proxy server that listens for connections on ports 9898 and 9899 and forwards traffic to the archive-verify service running internally. API calls to port 9898 are made as described above, e.g.:

# interact with archive-verify service on port 9898
curl 127.0.0.1:9898/api/1.0/status/1

API calls to port 9899 emulate how calls to the service running on Uppmax are made (i.e., going through a gateway). The first path element for these calls should be verify/, e.g.:

# interact with archive-verify service on port 9899
curl 127.0.0.1:9899/verify/api/1.0/status/1

The container log output can be followed:

# follow the container log output (Ctrl+C to stop)
docker/log

In addition, the archive-verify service in the container is running with the MockPdcClient enabled. In the container, there are two folders that can be used for testing with the mock client, test_1_archive and test_2_archive, e.g.:

# enqueue a download job of the test archive available in the container
curl \
  -X "POST" \
  -d '{"host": "test_host", "description": "my-descr", "archive": "test_1_archive"}' \
  http://localhost:9898/api/1.0/download

  # {
  #   "status": "pending",
  #   "job_id": "d7a26f2e-d410-4a9b-a308-973821d0a021",
  #   "link": "http://localhost:9898/api/1.0/status/d7a26f2e-d410-4a9b-a308-973821d0a021",
  #   "path": "data/test_host/runfolders/test_1_archive",
  #   "action": "download"
  # }

# check the status of the download job
curl \
  http://localhost:9898/api/1.0/status/d7a26f2e-d410-4a9b-a308-973821d0a021

  # {
  #   "state": "done",
  #   "msg": "Job d7a26f2e-d410-4a9b-a308-973821d0a021 has returned with result: Successfully verified archive md5sums."
  # }

# enqueue a verify job of the test archive available in the container, emulating a call to the service on Uppmax
curl \
  -X "POST" \
  -d '{"host": "test_host", "description": "my-descr", "archive": "test_2_archive"}' \
  http://localhost:9899/verify/api/1.0/verify

  # {
  #   "status": "pending",
  #   "job_id": "0124bdcb-7f31-4402-9251-ae766306ad49",
  #   "link": "http://localhost:9899/api/1.0/status/0124bdcb-7f31-4402-9251-ae766306ad49",
  #   "path": "data/test_host/runfolders/test_2_archive",
  #   "action": "verify"
  # }

# check the status of the verify job
curl \
  http://localhost:9899/verify/api/1.0/status/0124bdcb-7f31-4402-9251-ae766306ad49

  # {
  #   "state": "done",
  #   "msg": "Job 0124bdcb-7f31-4402-9251-ae766306ad49 has returned with result: Successfully verified archive md5sums."
  # }
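Because jobs run asynchronously, a client usually polls the status endpoint until the job finishes. The helper below is a hypothetical sketch: `fetch_status` is any callable that returns the decoded status JSON. Only the "done" state appears in the examples above; "error" is an assumed name for a failed job and should be adjusted to whatever the service actually reports.

```python
import time


def wait_for_job(fetch_status, terminal_states=("done", "error"),
                 interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll fetch_status() until the job reaches a terminal state.

    "done" is taken from the example responses; "error" is an assumption.
    Raises TimeoutError if no terminal state is seen within max_polls polls.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("state") in terminal_states:
            return status
        sleep(interval)
    raise TimeoutError(f"job not finished after {max_polls} polls")
```

The `sleep` parameter is injectable so that the loop can be exercised without real delays, for example by passing `sleep=lambda seconds: None`.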

The docker container can be stopped and removed:

# stop and remove the running docker container
docker/down

Contributors

b97pla, johandahlberg, johanherman, mariya, matrulda, slohse
