zutnop / telekom-workflow-engine

An embeddable Java framework for running long-lived business processes

License: MIT License

Java 94.81% CSS 5.15% Batchfile 0.04%

telekom-workflow-engine's Introduction

Introduction

This document gives an overview of the Telekom workflow engine. For more details, please read the wiki pages: https://github.com/zutnop/telekom-workflow-engine/wiki.

Workflow engine

The telekom-workflow-engine is a custom-built embeddable technology that provides a runtime environment for long-lived business processes. The main features, and the meaning of life, of the engine are:

  • execute and manage long-lived automatic business processes (workflows)
  • provide support for executing existing service layer methods (business logic) in those processes
  • provide support for human interaction in those processes (manual human tasks)
  • listen and react to signals/events coming from integrated external systems
  • guarantee robust error handling, transaction management, scalability and transparency
  • provide automated testing support for your workflows

Why yet another workflow engine?

The available open-source offerings were mostly based on drag-and-drop visual design, lots of XML files and the BPMN standard. That is great if you are building demos with fewer than 10 elements in the workflow or writing Gartner recommendations, but it quickly becomes a huge PITA if you are actually implementing and running complex real-world software on those platforms and integrating it with your existing services layer.

For the previous 10 years we had been using the BEA/Oracle WebLogic Integration platform, versions 7.x, 8.x and 9.x (end-of-life, no support, a black box, mystical problems, running instances disappearing, stuck with Java 5 and other ancient technologies, awfully slow and complex workflow development), and for the last 3 years we had been trying to find a new platform for our workflows. After having a brief look at Activiti, Bonita Open Solution and JBoss jBPM, we decided to try jBPM 5.4. After several proofs of concept on jBPM, and after adding/changing/enhancing/implementing/identifying all the missing parts (clustering, timers, WS-HumanTasks, Java tasks, admin console, REST interface), it became clear that this was NOT the way forward. The last straw came when we actually tried to implement a few of our simpler workflows: the developer experience was nowhere near good enough to commit to for the next 10 years (it was buggy, slow and complex, and we managed to break a definition in a way that required starting from scratch because there was no way to undo or fix it in the IDE, etc.)!

It was decided to build a lightweight embeddable workflow engine that meets our requirements and plays nicely together with our technology stack to achieve:

  • at least 10% time saving when developing applications based on automated workflows (the workflow definition implementation part wins way more than 10%, but there is a lot of other work involved in building the whole application (data layer, business services, web application, analysis, testing, user education etc.), so the 10%+ was estimated for the total win)
  • robustness (no black box)
  • performance (platform is designed to do only what we need it to do)
  • low lifetime costs (no licence fees, open technologies which can easily be updated)

Technical vision

Engine overview

The engine implementation can be divided into three main parts:

  • core - provides the runtime environment for workflow execution (based on graph oriented programming) together with all the supporting services (clustering, persistence, error handling etc.)
  • API - interface (DSL) for writing your workflow definitions and plugins, more info in the "Workflow implementations" section
  • web - web console, REST services and JMX interface for monitoring and interacting with the running engine

An empty engine by itself does not provide any value for end users. To do something useful with the engine, you need to write your workflow definitions using the API. The engine implementation and the workflow definitions should be kept in separate repositories to provide a clear distinction between the platform and the actual workflow business logic. The engine, together with those workflow definitions, is packaged as a *.war archive and deployed to Tomcat web server(s). When the web server starts, the engine spins up, reads the previous state from its DB and continues executing the ongoing instances, persisting the new state whenever a wait state is reached.

Error handling

The execution of long-lived workflows is a delicate process (it must survive technical problems), so a lot of effort has gone into making the workflow engine robust. Workflows are executed as transactions from one wait state to the next wait state. If the next wait state is successfully reached, the new state is persisted and the transaction is committed. If an exception occurs, the transaction is rolled back, the execution error is stored and the workflow goes into a frozen state, waiting for a human decision. After the problem has been corrected, the workflow can be continued from the last good wait state. This achieves:

  • good protection against infrastructure problems, e.g.:
    • database problems
    • webservice problems
    • external application errors
  • low impact of development errors - a bug which causes a NullPointerException freezes the instance; the bug is fixed and released into production; the workflow instance can be continued from the last wait state
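
The commit, rollback and freeze cycle described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: WorkflowInstanceSketch, its methods and the step counter are hypothetical names invented for illustration and are not part of the engine's API, and a real engine would use database transactions instead of a field update.

```java
// Hypothetical in-memory model of "execute from one wait state to the next".
// None of these names come from the telekom-workflow-engine code base.
class WorkflowInstanceSketch {
    enum Status { WAITING, FROZEN }

    private int persistedStep = 0;          // last wait state that was committed
    private Status status = Status.WAITING;
    private String lastError;               // kept so a human can inspect the failure

    int persistedStep() { return persistedStep; }
    Status status() { return status; }
    String lastError() { return lastError; }

    // Runs one unit of work; the new state is committed only if the work
    // reaches the next wait state without throwing.
    void runToNextWaitState(Runnable work) {
        if (status == Status.FROZEN) {
            throw new IllegalStateException("instance is frozen; unfreeze it first");
        }
        try {
            work.run();
            persistedStep++;                // "commit": the new wait state becomes durable
        } catch (RuntimeException e) {
            // "rollback": persistedStep is left untouched
            lastError = e.toString();       // store the execution error
            status = Status.FROZEN;         // wait for a human decision
        }
    }

    // The human decision taken after the underlying problem has been fixed.
    void unfreeze() {
        status = Status.WAITING;
        lastError = null;
    }
}

class WaitStateDemo {
    public static void main(String[] args) {
        WorkflowInstanceSketch instance = new WorkflowInstanceSketch();
        instance.runToNextWaitState(() -> { });                                   // reaches wait state 1
        instance.runToNextWaitState(() -> { throw new NullPointerException("bug"); });
        System.out.println(instance.status() + " at step " + instance.persistedStep());
        instance.unfreeze();                                                      // after the fix is released
        instance.runToNextWaitState(() -> { });                                   // retried from the last good wait state
        System.out.println(instance.status() + " at step " + instance.persistedStep());
    }
}
```

The point of the model is that a failure never advances the persisted step, so continuing after an unfreeze always restarts from the last wait state that was committed.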

Performance

The engine is built to provide high throughput and supports scalability in both dimensions:

  • vertical - faster CPU, more memory, increase the engine thread pool size to support more concurrency
  • horizontal - clustering and its dynamic management based on the Hazelcast framework; adding and removing cluster nodes does not disrupt the engine's work; the self-healing mechanism finds dead nodes and redistributes their pending work

Monitoring/management

The workflow engine publishes a web console, REST services and JMX interface for monitoring and interacting with the running engine. These services provide the following functionality:

  • overview of engine health and current work
  • list of workflow instances and human tasks
  • tools for managing the execution errors (cancel, unfreeze, etc)
  • tools for interacting with human tasks

Testing

The workflow engine is fully covered with unit and integration tests. The workflow definitions can also be easily tested with JUnit tests for rapid development and debugging.

Workflow implementations

To implement an automatic workflow, you need to implement the workflow process/rules in Java using the telekom-workflow-api interfaces. Each workflow is described in its own Java class as one or more steps.

Workflow definition example:

factory
    .start()
    
    // validate input
    .validateInputVariable( 0, "customerId", String.class)
    
    // load data into environment
    .variable( "customerBalance" ).call( 1, "customerService", "getCustomerBalance", "${customerId}" )
    // calculate the suspend time after the warning message has been sent
    .variable( "suspendTime" ).call( 2, "customerService", "getSuspendTimeAfterWarning" )

    .whileDo( 3, "customerBalance < 0 && suspendTime.time > System.currentTimeMillis()" )
        .split( 4 )
            .branch()
                // wait until suspendTime
                .waitUntilDate( 6, "${suspendTime}" )
            .branch()
                // and at the same time monitor balance changes
                .waitSignal( 7, "PAYMENT" )
        .joinFirst()
        .variable( "customerBalance" ).call( 8, "customerService", "getCustomerBalance", "${customerId}" )
    .whileDo()

    // only continue with the proceedings if the customer balance is still negative
    .if_( 9, "customerBalance < 0" )
        // create a suspend order
        .variable("suspendOrderId").callAsync( 10, "customerService", "suspendCustomer", "${customerId}" )
        
        // wait while the order is being processed
        .doWhile()
            .waitTimer( 11, "1000" )
            .variable( "suspendOrderStatus" ).call( 12, "customerService", "getOrderStatus", "${suspendOrderId}" )
        .doWhile( 13, "suspendOrderStatus == 'PROCESSING'" )
        
        // if the order fails, create a manual task
        .if_( 14, "suspendOrderStatus != 'COMPLETED'" )
            .humanTask( 15, "ROLE_CUSTOMER_SUPPORT", null ).withAttribute( "customerId", "${customerId}" ).withAttribute( "taskType", "MANUAL_SUSPEND" ).done()
        .endIf()
    
        // find out the next step 
        .variable( "nextStep" ).call( 16, "exampleStepSelector", "findNextStep", "${customerId}", "02" )
        //  and start it, passing along the customerId attribute
        .createInstance( 17, "${nextStep}", null, "${customerId}", null ).withAttribute( "customerId", "${customerId}" ).done()
    .endIf()

    .end();

telekom-workflow-engine's People

Contributors

alrikp, egonnaarits, jart, kuido85, mrtnkbnn, vitalipetrov, zutnop

telekom-workflow-engine's Issues

How to exit workflow?

I need to exit the workflow if a condition is met:

factory.start()
   [...]
   .if_(condition)
      .end() //compile error
   .endIf()
   [...]
.end();

Is this possible without nested if-blocks or throwing exceptions?

Spring 5 support

Currently the workflow engine does not support Spring version 5. The class org.springframework.jmx.export.MBeanExporter used to inherit a method setRegistrationBehaviorName, but in Spring version 5 that method has been removed. The following exception is thrown when starting the application: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'exporter' defined in class path resource [workflow-engine-jmx.xml].

When last node in the cluster is shut down, but the queue is not empty, those instances will get "stuck"

I assume that when shutting down the last node of the cluster, the shutdown waits for the running consumers to shut down, but doesn't check/enforce that the queue is empty, and still saves that the node shutdown was clean.

  1. 10 items were added to queue
  2. 3 consumers retrieved 3 items from queue, and completed 3 items
  3. 3 consumers retrieved 3 items from queue, engine received shutdown signal, and completed 3 items, and the consumers threads were stopped after that
  4. queue contained 4 unprocessed items (these workflow_instances are locked in DB)
  5. node shutdown was "clean"
  6. node startup saw that the node is in "clean" state, and did NOT attempt a recovery run, and thus those 4 locked instances were left locked in DB, but as the queue is now empty from fresh startup, then they were never picked up by the producers/consumers

POTENTIAL SOLUTION: check if queue is not empty and if last/only node in cluster -> either unlock stuff in DB or mark the node "not clean shutdown", then the next startup would pick the work up again.
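
The second option in the potential solution (marking the shutdown as not clean) could look roughly like the following. It is a sketch only: ShutdownCheckSketch, ShutdownState and stateToRecord are hypothetical names, and a plain java.util.Queue stands in for the engine's Hazelcast-backed work queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of the proposed check. Class and method names are
// illustrative and are not taken from the engine's source.
class ShutdownCheckSketch {
    enum ShutdownState { CLEAN, NOT_CLEAN }

    // Decides which state to record when a node stops. If this is the last
    // node and work is still queued, nobody else can pick that work up, so
    // the shutdown is recorded as NOT_CLEAN to trigger recovery on startup.
    static ShutdownState stateToRecord(Queue<String> workQueue, boolean lastNodeInCluster) {
        if (lastNodeInCluster && !workQueue.isEmpty()) {
            return ShutdownState.NOT_CLEAN;
        }
        return ShutdownState.CLEAN;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>();
        queue.add("complete:68782/8010592");   // one of the unprocessed items from the scenario
        System.out.println(stateToRecord(queue, true));
    }
}
```

With this rule the next startup would no longer see a "clean" state in step 6 of the scenario and would run its recovery, which is the behaviour the issue asks for.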

22:42:22,821 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:67667/8010575' to queue
22:42:22,821 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:67761/8010571' to queue
22:42:22,821 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:67668/8010574' to queue
22:42:22,821 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:67765/8010586' to queue
22:42:22,821 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:67791/8010581' to queue
22:42:22,822 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:68076/8010580' to queue
22:42:22,822 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:68782/8010592' to queue
22:42:22,822 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:68819/8010595' to queue
22:42:22,822 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:70419/8010589' to queue
22:42:22,822 |INFO ||producer-1          |e.t.w.e.p.WorkProducerServiceImpl| Adding 'complete:71354/8010598' to queue

22:42:22,821 |INFO ||consumer-1          |e.t.w.e.c.WorkConsumerServiceImpl| Retrieved 'complete:67667/8010575' from queue.
22:42:22,821 |INFO ||consumer-2          |e.t.w.e.c.WorkConsumerServiceImpl| Retrieved 'complete:67761/8010571' from queue.
22:42:22,821 |INFO ||consumer-3          |e.t.w.e.c.WorkConsumerServiceImpl| Retrieved 'complete:67668/8010574' from queue.

22:42:23,191 |INFO |complete:67667/8010575|consumer-1          |e.t.w.e.WorkflowExecutorImpl| Completed
22:42:23,192 |INFO |complete:67761/8010571|consumer-2          |e.t.w.e.WorkflowExecutorImpl| Completed
22:42:23,192 |INFO |complete:67668/8010574|consumer-3          |e.t.w.e.WorkflowExecutorImpl| Completed

22:42:23,191 |INFO ||consumer-1          |e.t.w.e.c.WorkConsumerServiceImpl| Retrieved 'complete:67765/8010586' from queue.
22:42:23,192 |INFO ||consumer-3          |e.t.w.e.c.WorkConsumerServiceImpl| Retrieved 'complete:67791/8010581' from queue.
22:42:23,193 |INFO ||consumer-2          |e.t.w.e.c.WorkConsumerServiceImpl| Retrieved 'complete:68076/8010580' from queue.

22:42:23,456 |INFO |complete:68076/8010580|consumer-2          |e.t.w.e.WorkflowExecutorImpl| Completed
22:42:23,456 |INFO ||consumer-2          |e.t.w.e.c.WorkConsumerJobImpl| Stopped consumer on thread consumer-2
22:42:23,457 |INFO |complete:67791/8010581|consumer-3          |e.t.w.e.WorkflowExecutorImpl| Completed
22:42:23,457 |INFO ||consumer-3          |e.t.w.e.c.WorkConsumerJobImpl| Stopped consumer on thread consumer-3
22:42:23,463 |INFO |complete:67765/8010586|consumer-1          |e.t.w.e.WorkflowExecutorImpl| Completed
22:42:23,463 |INFO ||consumer-1          |e.t.w.e.c.WorkConsumerJobImpl| Stopped consumer on thread consumer-1

22:42:23,463 |INFO ||localhost-startStop-2|e.t.w.e.c.WorkConsumerJobImpl| Stopped all consumers
22:42:23,463 |DEBUG||localhost-startStop-2|e.t.w.e.q.HazelcastWorkQueue| Stopping queue
22:42:23,463 |INFO ||localhost-startStop-2|e.t.w.e.q.HazelcastWorkQueue| Stopped queue

Fix all javadoc errors

When building and releasing with Java 8, doclint generates lots of javadoc errors and thus the release builds fail.

Example errors:

[ERROR] C:\TWE\telekom-workflow-engine\telekom-workflow-engine\src\main\java\ee\telekom\workflow\api\ElementUtil.java:28: warning: no @param for outputElement
[ERROR] public static OutputMapping createOutputMapping( Element outputElement ){
[ERROR] ^
[ERROR] C:\TWE\telekom-workflow-engine\telekom-workflow-engine\src\main\java\ee\telekom\workflow\api\ElementUtil.java:28: warning: no @return
[ERROR] public static OutputMapping createOutputMapping( Element outputElement ){
[ERROR] ^
[ERROR] C:\TWE\telekom-workflow-engine\telekom-workflow-engine\src\main\java\ee\telekom\workflow\api\ElementUtil.java:51: warning: no @param for arguments
[ERROR] public static InputMapping[] createArrayMapping( Object[] arguments ){
[ERROR] ^
[ERROR] C:\TWE\telekom-workflow-engine\telekom-workflow-engine\src\main\java\ee\telekom\workflow\api\ElementUtil.java:51: warning: no @return
[ERROR] public static InputMapping[] createArrayMapping( Object[] arguments ){
[ERROR] ^

telekom-workflow-example with Spring 4 uses GsonHttpMessageConverter, which by default doesn't serialize NULL values

For example, when looking at the workflow instances list, where a workflow instance's label2 value is null, the browser will alert: "DataTables warning: table id=instancesTable - Requested unknown parameter 'label2' for row 0. For more information about this error, please see http://datatables.net/tn/4"

As the telekom-workflow-example runtime environment contains the GSON library (and doesn't contain the Jackson libraries), Spring MVC 4 configures the default GsonHttpMessageConverter, which contains a default Gson instance where serializeNulls=false. Thus all null values are stripped from the response JSON.

It does not affect our real application, because we use the Jackson libraries and thus Spring uses the Jackson message converter.

CORRECT FIX:

  1. Remove Spring 3 support classes, like GsonHttpMessageConverterForSpring3.
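  2. Re-configure Spring's message converter (or plug in a new instance). One way:
import java.util.List;

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.converter.HttpMessageConverter;
import org.springframework.http.converter.json.GsonHttpMessageConverter;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurationSupport;

@Configuration
class WebMvcConfiguration extends WebMvcConfigurationSupport {
    @Override
    protected void extendMessageConverters(List<HttpMessageConverter<?>> converters) {
        for (HttpMessageConverter<?> converter : converters) {
            if (converter instanceof GsonHttpMessageConverter) {
                // replace the default Gson instance with one that keeps null values
                Gson gson = new GsonBuilder().serializeNulls().create();
                ((GsonHttpMessageConverter)converter).setGson(gson);
            }
        }
    }
}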

No overview of recent workflows - a critical business functionality missing

After upgrading from 1.0.3 to 1.2.6, the search functionality that opens when clicking on an item from the "Workflows" tab no longer allows searching with no filter (message "Empty filter is not allowed"). This makes it very difficult to get an overview of workflows that have run recently if there are a lot of definitions to choose from, causing considerable management overhead.

If an empty filter is not acceptable, the search should at least allow searching by a date range or make it possible to return a limited set of the latest workflow instances.

Hazelcast 3.2+ compatibility

From Hazelcast version 3.2 and up, com.hazelcast.nio.serialization.DataSerializable no longer extends java.io.Serializable. For that reason serialization errors start to occur:

08:48:47.403 --- ERROR WorkProducerJobImpl.run(WorkProducerJobImpl.java:91) - java.io.NotSerializableException: ee.telekom.workflow.core.workunit.WorkUnit
org.apache.commons.lang3.SerializationException: java.io.NotSerializableException: ee.telekom.workflow.core.workunit.WorkUnit
at org.apache.commons.lang3.SerializationUtils.serialize(SerializationUtils.java:157) ~[commons-lang3-3.3.2.jar:3.3.2]
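
The failure mode can be reproduced without Hazelcast on the classpath. In the sketch below, DataSerializableStandIn is a hypothetical stand-in for the Hazelcast interface, and WorkUnitBefore and WorkUnitAfter are simplified illustrations rather than the engine's real WorkUnit class. Declaring java.io.Serializable explicitly is shown as one possible fix, assuming the class should stay usable with plain Java serialization (which is what commons-lang3 SerializationUtils relies on).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for com.hazelcast.nio.serialization.DataSerializable
// as of 3.2+: it does not extend java.io.Serializable.
interface DataSerializableStandIn { }

// Mirrors the failing case: only the Hazelcast-style interface is implemented.
class WorkUnitBefore implements DataSerializableStandIn {
    long workflowInstanceId = 67667L;
}

// One possible fix: declare java.io.Serializable explicitly.
class WorkUnitAfter implements DataSerializableStandIn, Serializable {
    private static final long serialVersionUID = 1L;
    long workflowInstanceId = 67667L;
}

class JavaSerializationCheck {
    // Returns true if plain Java serialization accepts the object.
    static boolean isJavaSerializable(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isJavaSerializable(new WorkUnitBefore()));
        System.out.println(isJavaSerializable(new WorkUnitAfter()));
    }
}
```

The first class is rejected for the same reason as in the stack trace above, while the second is accepted because it implements java.io.Serializable itself.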
