lyber-core's Introduction

lyber_core

Robot Creation

Create a class that subclasses LyberCore::Robot

  • In the initializer, call super with the workflow name and the step name.
  • Your class's #perform_work method performs the actual work; druid is available as an instance variable.
module Robots
  module DorRepo
    module Accession

      class Shelve < LyberCore::Robot

        def initialize
          super('accessionWF', 'shelve')
        end

        def perform_work
          cocina_object.shelve
        end

      end

    end
  end
end

By default, the druid will be set to the completed state, but you can optionally have it set to skipped by returning a ReturnState object, as shown below. A ReturnState can also carry a custom note back to the workflow.

module Robots
  module DorRepo
    module Accession

      class Shelve < LyberCore::Robot

        def initialize
          super('accessionWF', 'shelve')
        end

        def perform_work
          if some_logic_here_to_determine_if_shelving_occurs
            cocina_object.shelve
            return LyberCore::ReturnState.new(status: 'completed') # set the final state to completed
#           return LyberCore::ReturnState.new(status: 'completed', note: 'some custom note to pass back to workflow') # set the final state to completed with a custom note

          else
            # just return skipped if we did nothing
            return LyberCore::ReturnState.new(status: 'skipped') # set the final state to skipped
#           return LyberCore::ReturnState.new(status: 'skipped', note: 'some custom note to pass back to workflow') # set the final state to skipped with a custom note
          end
        end

      end

    end
  end
end

Robot Environment Setup

Create a config/boot.rb containing:

require 'rubygems'
require 'bundler/setup'
Bundler.require(:default)

LyberCore::Boot.up(__dir__)

# Any additional robot-specific configuration.

The configuration must include:

redis_url: ~

workflow:
  url: http://workflow.example.com/workflow
  logfile: 'log/workflow_service.log'
  shift_age: 'weekly'
  timeout: 60

And optionally:

# For Dor Services Client
dor_services:
  url:  'https://dor-services-test.stanford.test'
  token: secret-token

# For Cocina::Models::Mapping::Purl
purl_url: 'https://purl-example.stanford.edu'

# For DruidTools::Druid
stacks:
  local_workspace_root: ~
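
A minimal sketch of checking a configuration file for the settings listed above before starting a robot. The helper names (`missing_settings`, `key_path?`) are hypothetical and not part of lyber-core, and treating every key under `workflow` as required is an assumption drawn from the sample above.

```ruby
require 'yaml'

# Settings the README lists under "must include", as key paths.
# Assumption: every key shown in that sample is required.
REQUIRED_KEYS = [
  %w[redis_url],
  %w[workflow url],
  %w[workflow logfile],
  %w[workflow shift_age],
  %w[workflow timeout]
].freeze

# True when every key along the path exists, even if the final value is nil
# (redis_url: ~ parses to nil but still counts as present).
def key_path?(config, path)
  *parents, leaf = path
  node = parents.reduce(config) do |hash, key|
    return false unless hash.is_a?(Hash) && hash.key?(key)

    hash[key]
  end
  node.is_a?(Hash) && node.key?(leaf)
end

# Hypothetical helper: returns the dotted names of required settings
# that are absent from a parsed configuration hash.
def missing_settings(config)
  REQUIRED_KEYS.reject { |path| key_path?(config, path) }
               .map { |path| path.join('.') }
end

# Sample configuration with workflow.timeout deliberately left out.
SAMPLE = <<~YAML
  redis_url: ~
  workflow:
    url: http://workflow.example.com/workflow
    logfile: 'log/workflow_service.log'
    shift_age: 'weekly'
YAML

config = YAML.safe_load(SAMPLE)
puts missing_settings(config).inspect # ["workflow.timeout"]
```

Checking for key presence rather than a non-nil value keeps `redis_url: ~` valid, which matches the sample configuration.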

The following environment variables can optionally be set:

  • ROBOT_ENVIRONMENT
  • ROBOT_LOG_LEVEL

Robot Testing

Include the following in rspec/spec_helper.rb:

ENV['ROBOT_ENVIRONMENT'] = 'test'
require File.expand_path("#{__dir__}/../config/boot")

include LyberCore::Rspec

Robots can be invoked with:

test_perform(robot, druid)

to avoid the workflow updates in perform().

lyber-core's Issues

Update to dor-services 5.x

This also entails updating all consumers of lyber-core to use the 5.x-based release (if they actually hit the backend in any way).

item_queued? is returning wrong result

When there is more than one version.

e.g.:

curl https://sul-lyberservices-test.stanford.edu/workflow/dor/objects/druid:hx908xy6904/workflows/assemblyWF
<workflow repository="dor" objectId="druid:hx908xy6904" id="assemblyWF">
  <process version="1" priority="0" note="" lifecycle="pipelined" laneId="default" elapsed="" attempts="0" datetime="2019-01-28T20:40:18+00:00" status="completed" name="start-assembly"/>
  <process version="1" priority="0" note="" lifecycle="" laneId="default" elapsed="" attempts="0" datetime="2019-01-28T20:40:18+00:00" status="skipped" name="jp2-create"/>
  <process version="1" priority="0" note="sul-robots1-test.stanford.edu" lifecycle="" laneId="default" elapsed="0.25" attempts="0" datetime="2019-01-28T20:40:18+00:00" status="completed" name="checksum-compute"/>
  <process version="1" priority="0" note="sul-robots1-test.stanford.edu" lifecycle="" laneId="default" elapsed="0.306" attempts="0" datetime="2019-01-28T20:40:18+00:00" status="completed" name="exif-collect"/>
  <process version="1" priority="0" note="sul-robots2-test.stanford.edu" lifecycle="" laneId="default" elapsed="0.736" attempts="0" datetime="2019-01-28T20:40:18+00:00" status="completed" name="accessioning-initiate"/>
  <process version="2" priority="0" note="" lifecycle="" laneId="default" elapsed="" attempts="0" datetime="2019-01-29T22:51:09+00:00" status="completed" name="start-assembly"/>
  <process version="2" priority="0" note="contentMetadata.xml exists" lifecycle="" laneId="default" elapsed="0.278" attempts="0" datetime="2019-01-29T22:51:09+00:00" status="skipped" name="content-metadata-create"/>
  <process version="2" priority="0" note="" lifecycle="" laneId="default" elapsed="0.0" attempts="0" datetime="2019-01-29T22:51:09+00:00" status="queued" name="jp2-create"/>
  <process version="2" priority="0" note="" lifecycle="" laneId="default" elapsed="0.0" attempts="0" datetime="2019-01-29T22:51:09+00:00" status="queued" name="checksum-compute"/>
  <process version="2" priority="0" note="" lifecycle="" laneId="default" elapsed="0.0" attempts="0" datetime="2019-01-29T22:51:09+00:00" status="queued" name="exif-collect"/>
  <process version="2" priority="0" note="" lifecycle="" laneId="default" elapsed="0.0" attempts="0" datetime="2019-01-29T22:51:09+00:00" status="queued" name="accessioning-initiate"/>
</workflow>

Caused by: sul-dlss/dor-workflow-client#56
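
A sketch of one way a multi-version workflow document can mislead a status check. It assumes, without confirmation from the issue, that the lookup takes the first process matching the step name instead of the one for the latest version. The hashes stand in for `<process>` elements, and the helper names are hypothetical.

```ruby
# Each hash stands in for one <process> element of the workflow XML above.
PROCESSES = [
  { version: 1, name: 'jp2-create', status: 'skipped' },
  { version: 1, name: 'checksum-compute', status: 'completed' },
  { version: 2, name: 'jp2-create', status: 'queued' },
  { version: 2, name: 'checksum-compute', status: 'queued' }
].freeze

# Hypothetical helper: status of a step for the highest version present,
# or nil when the step does not appear at all.
def latest_status(processes, step)
  matches = processes.select { |process| process[:name] == step }
  latest = matches.max_by { |process| process[:version] }
  latest && latest[:status]
end

# Hypothetical helper: is the step queued in the latest version?
def queued_in_latest_version?(processes, step)
  latest_status(processes, step) == 'queued'
end

# A naive lookup that stops at the first match reads the version 1 entry.
first_match = PROCESSES.find { |process| process[:name] == 'jp2-create' }[:status]

puts first_match                                         # skipped
puts queued_in_latest_version?(PROCESSES, 'jp2-create')  # true
```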

support robots being able to set a skipped status

We want to provide support for a robot to set the skipped status. Currently, after the .perform method is done, lyber-core always sets the status to either completed or error.

The proposal is to keep completed as the default when a robot is used as is, so that all existing robots stay backwards compatible, while allowing a robot to return an optional value that indicates the desired end state.

To implement, we could allow the .perform method to return a Results class/struct that has the status (skipped or completed) that the framework should use on successful completion.

See https://github.com/sul-dlss/lyber-core/blob/master/lib/lyber_core/robot.rb#L77
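
The proposal could be sketched roughly as follows. `Result` and `final_state` are hypothetical stand-ins for illustration and are not the lyber-core implementation; the point is only that any return value other than a recognized result object falls back to completed, so existing robots keep their behavior.

```ruby
# Stand-in for the proposed result object; not the lyber-core class.
Result = Struct.new(:status, :note, keyword_init: true)

# End states a robot may request on successful completion.
ALLOWED_STATUSES = %w[completed skipped].freeze

# Hypothetical framework hook: decide the final state from whatever the
# robot's work method returned.
def final_state(return_value)
  if return_value.is_a?(Result) && ALLOWED_STATUSES.include?(return_value.status)
    return_value
  else
    # Existing robots return arbitrary values (often nil); default to completed.
    Result.new(status: 'completed')
  end
end

puts final_state(nil).status                                                      # completed
puts final_state(true).status                                                     # completed
puts final_state(Result.new(status: 'skipped', note: 'nothing to shelve')).status # skipped
```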

LyberCore::Robot module includes `initialize` method, not modular

A module should not include an initialize method. The point of a module is modularity, i.e. methods that could be included in different classes and different types of classes. But if a module provides initialize then it is specifying fundamental class-defining behavior AND cannot be used with other such modules without order dependency and a foreknowledge of interdependency. That isn't modular.

LyberCore::Robot module includes an initialize method here:
https://github.com/sul-dlss/lyber-core/blob/master/lib/lyber_core/robot.rb#L51

Seems like it would be less of an anti-pattern to just be a base class.
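
The order dependency described here can be reproduced with a small sketch. All module and class names below are invented for illustration and do not appear in lyber-core.

```ruby
# Two mixins that each define initialize without calling super.
module WithQueue
  def initialize
    @queue = 'default'
  end
end

module WithLogger
  def initialize
    @logger = :stdout
  end
end

class MixedRobot
  include WithQueue
  include WithLogger # included last, so its initialize is found first
end

# The base-class alternative: one initialize, extended through super.
class BaseRobot
  def initialize(workflow, step)
    @workflow = workflow
    @step = step
  end
end

class ShelveRobot < BaseRobot
  def initialize
    super('accessionWF', 'shelve')
  end
end

# WithQueue#initialize never runs, so @queue is never set.
puts MixedRobot.new.instance_variables.inspect  # [:@logger]
puts ShelveRobot.new.instance_variables.inspect # [:@workflow, :@step]
```

Swapping the two `include` lines changes which instance variable is set, which is the foreknowledge of interdependency the issue objects to.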

logger does not output name of robot

When you run a robot manually, or configure them to output to stdout, the log doesn't report which robot is running, so you get something like:

 INFO [2014-12-03 15:41:29] (748)  :: bh152hk2665 processing
 INFO [2014-12-03 15:41:30] (748)  :: bh152hk2665 completed in 0.1384s

See https://github.com/sul-dlss/lyber-core/blob/master/lib/lyber_core/robot.rb#L74

But I'd like to know the name of the robot, like this:

 INFO [2014-12-03 15:41:29] (748)  :: robot-name :: bh152hk2665 processing
 INFO [2014-12-03 15:41:30] (748)  :: robot-name :: bh152hk2665 completed in 0.1384s
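
A sketch of how the requested format could be produced with Ruby's standard Logger and a custom formatter. `build_robot_logger` is a hypothetical helper, and how lyber-core actually configures its logger is not shown in this issue.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: a logger whose lines include the robot name
# between the process id and the message.
def build_robot_logger(io, robot_name)
  logger = Logger.new(io)
  logger.formatter = proc do |severity, time, _progname, msg|
    format("%5s [%s] (%d)  :: %s :: %s\n",
           severity, time.strftime('%Y-%m-%d %H:%M:%S'), Process.pid, robot_name, msg)
  end
  logger
end

# Write to an in-memory buffer so the result can be inspected.
out = StringIO.new
logger = build_robot_logger(out, 'shelve')
logger.info('bh152hk2665 processing')
puts out.string
```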

some exceptions not caught

Sometimes robot exceptions are not caught correctly by lybercore::work and they are sent to Resque to be put in the /failed queue....

Worker
sul-robots1-prod.stanford.edu:2219 on DOR_ACCESSIONWF_PUBLISH_DEFAULT at 6 minutes ago Retry or Remove
Class
 Robots::DorRepo::Accession::Publish
Arguments
--- druid:qf593jg6933
...
Exception
Dor::Describable::CrosswalkError
Error
Unknown descMetadata namespace: nil
/home/lyberadmin/common-accessioning/shared/bundle/ruby/1.9.1/gems/dor-services-4.13.0/lib/dor/models/describable.rb:41:in `generate_dublin_core'
/home/lyberadmin/common-accessioning/shared/bundle/ruby/1.9.1/gems/dor-services-4.13.0/lib/dor/models/publishable.rb:54:in `publish_metadata'
/home/lyberadmin/common-accessioning/releases/20140910190822/robots/accession/publish.rb:17:in `perform'
/home/lyberadmin/common-accessioning/shared/bundle/ruby/1.9.1/gems/lyber-core-3.2.4/lib/lyber_core/robot.rb:67:in `block in work'
/usr/local/rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/benchmark.rb:295:in `realtime'
/home/lyberadmin/common-accessioning/shared/bundle/ruby/1.9.1/gems/lyber-core-3.2.4/lib/lyber_core/robot.rb:66:in `work'
/home/lyberadmin/common-accessioning/shared/bundle/ruby/1.9.1/gems/lyber-core-3.2.4/lib/lyber_core/robot.rb:20:in `perform'

when sidekiq process times out, lyber-core does not catch the error and report it to workflow service

If a Sidekiq process for managing workers is shut down gracefully, as happens when the timeout window is reached for hotswapping old processes for new ones after a robot deployment, any work in progress when the job is killed will error in such a way that lyber-core doesn't trap it and report it to workflow service. This leaves the workflow step in the started state instead of putting it in a failed state. The job will actually hit the retry queue, but when a robot picks the job up from Sidekiq, workflow service will say that the job isn't queued, and the job won't run. Here's an example in common-accessioning from 4:09 pm this afternoon, when the 4 day timeout started by this week's dependency update deployment was reached: https://app.honeybadger.io/projects/52894/faults/95009475/01H5BAMR6YHW10C8777YYY6TEV

Item druid:sz929gx7593 is not queued for checksum-compute (assemblyWF), but has status of 'started'. Will skip processing

I think there are likely many other ways that we can encounter that error message, and this is one of the newer ones (since we've only implemented Sidekiq hotswap in the last few months).
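
One possible mechanism, not confirmed in this issue: Sidekiq interrupts running jobs at shutdown with Sidekiq::Shutdown, which inherits from Interrupt and not from StandardError, so a wrapper that rescues only StandardError never sees it. The sketch below uses a stand-in exception class and a hypothetical `run_step` wrapper; it is not lyber-core code.

```ruby
# Stand-in for a shutdown signal; like Interrupt, it is not a StandardError.
class FakeShutdown < Interrupt; end

# Hypothetical wrapper: runs the work and records the step status.
# rescue_class controls how broad the rescue is.
def run_step(statuses, rescue_class)
  statuses << 'started'
  yield
  statuses << 'completed'
rescue rescue_class => e
  statuses << "error: #{e.class}"
  raise if e.is_a?(FakeShutdown) # re-raise so the job runner still sees the shutdown
end

# Narrow rescue: the shutdown passes straight through the wrapper.
narrow = []
begin
  run_step(narrow, StandardError) { raise FakeShutdown }
rescue FakeShutdown
  # propagated without a status update
end
puts narrow.inspect # ["started"]

# Broad rescue: the status is recorded before the shutdown is re-raised.
broad = []
begin
  run_step(broad, Exception) { raise FakeShutdown }
rescue FakeShutdown
  # propagated after the status update
end
puts broad.inspect # ["started", "error: FakeShutdown"]
```

Rescuing Exception is usually discouraged, so a real fix would more likely rescue the specific shutdown class, record the state, and re-raise.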

See also this Slack thread where we were discussing the aforementioned checksum-compute job, and figured out why it disappeared without us noticing an error at first: https://stanfordlib.slack.com/archives/C09M7P91R/p1689373245001619

cc @andrewjbtw
