
sfr-data-enhance-manager

ResearchNow Metadata Enhancement Manager

This Lambda is a simple service that manages the data flow of the ResearchNow ingest pipeline. It reads records from multiple Kinesis streams and passes each record to the appropriate step in the metadata enhancement/parsing pipeline. Once a record has completed every step, the Lambda passes the completed metadata block to the database for storage.
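
As a rough illustration of the first part of that flow, the sketch below decodes the records in a Kinesis-triggered Lambda event. Kinesis delivers each payload base64-encoded in record.kinesis.data. The function name decodeRecords, the sample event, and the assumption that each payload is a JSON object with a stage field are illustrative only and are not taken from this repository.

```javascript
// Minimal sketch, not this repository's actual handler.
// Assumption: every Kinesis payload is a JSON-encoded object.
const decodeRecords = (event) =>
  (event.Records || []).map((record) => {
    // Kinesis event sources deliver the payload as a base64 string
    const json = Buffer.from(record.kinesis.data, 'base64').toString('utf8');
    return JSON.parse(json);
  });

// Hypothetical sample event, shaped like a Kinesis-triggered Lambda event
const samplePayload = { stage: 'mw', data: { title: 'Example Title' } };
const sampleEvent = {
  Records: [
    {
      kinesis: {
        data: Buffer.from(JSON.stringify(samplePayload)).toString('base64')
      }
    }
  ]
};

const decoded = decodeRecords(sampleEvent);
console.log(decoded.length, decoded[0].stage);
```

An event of the same shape could be saved as event.json for a local run, assuming the real handler expects Kinesis-style records.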

Installation/Deployment

  1. Clone the repository and run npm install
  2. Copy .env.sample to .env and adjust the necessary values (recommended settings are noted)
  3. Install node-lambda globally with npm install -g node-lambda if it is not already installed
  4. Copy the local/development/production.env.sample files and set the appropriate values
  5. Copy the event_sources_*.env.sample files and set the appropriate values
  6. Run the appropriate commands from package.json
    • npm run local-run runs the Lambda with the values provided in an event.json file
    • npm run deploy-* packages and deploys the Lambda to the designated environment

Description

The manager uses a stage variable to determine which step of the enhancement process should receive the current record:

  • NONE/new: Pass to the first stage in the metadata process (currently MetadataWrangler)
  • mw: Processed by MetadataWrangler, pass to the OCLC enhancer
  • oclc: Processed by MW and OCLC, mark as complete
  • complete: Finished, pass to database manager
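
The stage handling listed above can be sketched as a simple lookup. This is a minimal illustration under stated assumptions, not the repository's implementation: the names STAGE_ACTIONS and nextAction, the action strings, and the choice to treat a missing, null, or 'NONE' stage the same as 'new' are all hypothetical.

```javascript
// Hypothetical mapping from the current stage to the action taken next.
// The action strings are descriptive labels, not real stream or function names.
const STAGE_ACTIONS = {
  new: 'send to MetadataWrangler',
  mw: 'send to OCLC enhancer',
  oclc: 'mark as complete',
  complete: 'send to database manager'
};

const nextAction = (record) => {
  const raw = record.stage;
  // Assumption: a missing, null, or 'NONE' stage means a brand-new record
  const stage = raw === undefined || raw === null || raw === 'NONE' ? 'new' : raw;
  if (!Object.prototype.hasOwnProperty.call(STAGE_ACTIONS, stage)) {
    // Fail loudly on unrecognized stages instead of silently dropping records
    throw new Error(`Unknown stage: ${stage}`);
  }
  return STAGE_ACTIONS[stage];
};

console.log(nextAction({ stage: 'mw' }));
```

Keeping the mapping in one table means an additional enhancement step would only need a new entry and an updated hand-off from the preceding stage.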

TODO

  • Add additional enhancement steps (LC, Getty, etc.)
  • Add a Kinesis stream/DLQ for re-processing failed records
