
actionml / harness


Harness is a Machine Learning/AI Server with plugins for many algorithms including the Universal Recommender

License: Apache License 2.0

Scala 90.44% Shell 5.92% Java 0.43% Python 2.49% Dockerfile 0.37% Makefile 0.35%

harness's Introduction


Harness Overview

This project implements a microservice-based Machine Learning Server. It provides an API for plug-in Engines and implements all services needed for input and query. It is also the platform for the Universal Recommender, which is a Harness Engine.

Harness features include:

Architecture

  • Microservice Architecture: Harness uses best-in-class services to achieve scalable, performant operation and, where possible, packages internal systems as lightweight microservices.
  • Single REST API: all functionality is accessible through a complete REST API, with a Command Line Interface to make administration easier and SDKs for input and queries.
  • Apache Spark Support: Spark is supported optionally, so Engines with MLlib algorithms are easy to create; the Universal Recommender Engine uses Spark to build its models.
  • Compute-Engine Neutral: supports any compute engine or pre-packaged algorithm library that is JVM compatible, for example Spark, TensorFlow, Vowpal Wabbit, MLlib, Mahout, etc. Harness does not require Spark or HDFS but supports them with the Harness Toolbox.

Operation

  • Scalability: to very large datasets via indefinitely scalable compute engines like Spark and databases like MongoDB.
  • Realtime Input Validation: Engines supply realtime input validation.
  • Client Side Support: SDKs implement support for TLS/SSL and Auth and are designed for asynchronous non-blocking use.
    • Java SDK: is supplied for Events and Queries REST endpoints
    • Python SDK: implements client side support for all endpoints and is used by the Python CLI.
    • REST without SSL and Auth is simple and can use any HTTP client—curl, etc. SSL and Auth complicate the client but the API is well specified.
  • Command Line Interface (CLI) is implemented as calls to the REST API and so is securely remotable. See the Harness-CLI repo for the Python implementation.
  • Async SDKs and Internal APIs: both our outward-facing REST APIs, as seen from the SDKs, and most of our internal APIs, including those that communicate with databases, are based on asynchronous/non-blocking IO.
  • Provisioning/Deployment can be done by time-proven binary installation or using modern container methods:
    • Containerized: provisioning using Docker and Docker-compose for easy installs or scalable container orchestration.
    • Container Orchestration: ask us about Kubernetes support.
    • Instance Creation: optional compute and storage resource creation using cloud-platform-neutral tools like kops (Kubernetes Operations).

Machine Learning Support

  • Flexible Learning Styles:
    • Kappa: Online learners that update models in realtime
    • Lambda: Batch learners that update or re-compute models in the background during realtime input and queries.
    • Hybrids: The Universal Recommender (UR) is an example of a mostly Lambda Engine that will modify some parts of the model in realtime and will use realtime input to make recommendations but the heavy lifting (big data) part of model building is done in the background.
  • Mutable and Immutable Data Streams: can coexist in Harness allowing the store to meet the needs of the algorithm.
    • Realtime Model Updates: if the algorithm allows (as with the Universal Recommender), model updates can be made in realtime. This is accomplished by allowing some or all of the data to affect the model as it is received.
  • Immutable Data Stream TTLs: even for immutable data streams, events can be expired via TTLs (time-to-live) so the dataset does not grow without bound.
  • Data Set Compatibility with Apache PredictionIO is supported where there is a matching Engine in both systems. For instance, pio export <some-ur-app-id> produces JSON that can be directly read by the Harness UR via harness-cli import <some-ur-engine-id> <path-to-pio-export>. This gives an easy way to upgrade from PredictionIO to Harness.

Security

  • Authentication support using server-to-server bearer tokens, similar to basic auth but from the OAuth 2.0 spec, supported in both the Server and the Client SDKs.
  • Authorization for Secure Multi-tenancy: runs multiple Engines with separate Permissions, assign "Client" roles for multiple engines, or root control through the "Admin" role.
    • Multiple Engine Instances for any Engine Type: allow any number of variations on one Engine type or multiple Engine types.
    • Multiple Tenants/Multiple Permissions: allow user+secret level access control to protect one "tenant" from another. Or Auth can be disabled to allow all Engines access without a user+secret for simple deployments.
  • TLS/SSL support from akka-http on the Server as well as in the Java and Python Client SDKs.
  • User and Permission Management: Built-in user+secret generation with permissions management at the Engine instance level.

Pre-requisites

In its simplest form Harness has few external pre-requisites. To run it on one machine the requirements are:

Docker-Compose

Use the Docker host tools with docker-compose, or run Harness in a single container with dependent services installed separately. Prerequisites:

  • *nix-style Host OS
  • Docker
  • Docker-compose

See the harness-docker-compose project for further information.

Source Build

For installation on a host OS such as an AWS instance without Docker, the minimum requirements are:

  • Python 3 for CLI and Harness Python SDK
  • Pip 3 to add the SDK to the Python CLI
  • MongoDB 3.x
  • Some recent *nix OS
  • Services used by the Engine of your choice. For instance the UR requires Elasticsearch

Each Engine has its own requirements driven by decisions like what compute engine to use (Spark, TensorFlow, Vowpal Wabbit, DL4J, etc) as well as what Libraries it may need. See specific Engines for their extra requirements.

Architecture

At its core Harness is a fast lightweight Server that supplies 2 very flexible APIs and manages Engines. Engines can be used together or as a single solution.

Harness with Multiple Engines

Harness Core

Harness is a REST server with an API for Engines, Events, Queries, Users, and Permissions. Events and Queries end-points handle input and queries to Engine instances. The rest of the API is used for administrative functions and is generally accessed through the CLI.

The Harness Server core is small and fast and so is appropriate in single Algorithm solutions—an instance of The Universal Recommender is a good example. By adding the Harness Auth-server and scalable Engine backend services Harness can also become a full featured multi-tenant SaaS System.

Router

The Harness core is made from a component called a Router, which maintains REST endpoints for the various object collections. It routes all requests and responses to/from Engine Instances. It also supports SSL, OAuth2 signature-based authentication, and REST route authorization, all of which are optional.

Algorithm-specific Engines can be implemented without the need to deal with REST APIs. Engines start by plugging into the abstract Engine API and inherit all server features including administration, input, and query APIs.

Administrator

The Administrator manages Engine Instances (and optionally Users and Permissions). With release 0.6.0 Harness will support service discovery via integrated Etcd. The Administrator integrates with the Harness Auth Server and Etcd to provide all admin features.

Engines

An Engine is the implementation of a particular algorithm. Engines can be found in the com.actionml.core.engines module. Each Engine must supply the required APIs, but what they do when invoked is up to the Engine. The format of input and queries is also Engine specific, so see the Engine's docs for more info. The interface for Engines is designed based on years of implementing production-worthy ML/AI algorithms and follows patterns that fit nearly all categories of algorithm types.

The common things about all Engines:

  • input is received as JSON "events" from REST POST requests or can be imported from a raw filesystem or the Hadoop Distributed File System (HDFS); see the sketch after this list.
  • queries are JSON from REST POST requests.
  • query results are returned in the response for the request and their format is defined by the Engine type.
  • all Engines have JSON configuration files with generic Engine control params as well as Algorithm-specific params. See the Engine docs for Algorithm params and the Harness docs for generic Engine params.
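
As an illustration of the input path, here is a minimal Python sketch (using the requests library directly rather than the Harness SDKs) that reads a file of newline-delimited JSON events and POSTs each one to an Engine instance's events endpoint, using the URL pattern shown later in the REST examples. The host, port, engine id, and file name are placeholder assumptions; with TLS/Auth enabled the SDKs are the better choice.

import json
import requests

# Placeholder assumptions: adjust host, port, and engine id for your deployment.
HARNESS_URL = "http://localhost:9090"
ENGINE_ID = "some-engine-id"

def send_events(path):
    """POST each newline-delimited JSON event in `path` to the Engine's events endpoint."""
    events_url = f"{HARNESS_URL}/engines/{ENGINE_ID}/events"
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Each line is one JSON event; its format is defined by the Engine.
            resp = requests.post(events_url, json=json.loads(line))
            resp.raise_for_status()

if __name__ == "__main__":
    send_events("events.json")  # hypothetical file of newline-delimited events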

Scaling

Harness is a stateless service. It does very little work itself, delegating most work to the Engines and other services it depends on. These services are all best-in-class, made to scale from a shared single machine to massive clusters seamlessly. Harness provides a toolbox of these scalable services that can be used by Engines for state, datasets, and computing.

Harness is scaled by scaling the services it and the various Engines use. For example, the Universal Recommender scales by using more or larger nodes for training (Spark), input storage (MongoDB), and model storage (Elasticsearch). In this manner Harness can be scaled either horizontally (for high availability) or vertically (for cost savings) to supply virtually unlimited resources to an Engine Instance.

Compute Engines

The use of scalable Compute Engines allows Big Data to be used in Algorithms. For instance, one common Compute Engine is Apache Spark, which along with the Hadoop Distributed File System (HDFS) forms a massively scalable platform. Spark and HDFS support are provided in the Harness Toolbox.

Input Store

Due to its extremely flexible data indexing, convenient TTL support, and massive scalability, MongoDB is the default data store.

Server REST API

Harness REST is optionally secured by TLS and Authentication. This requires extensions to the typical REST API to support authentication control objects like Users and Roles; here we ignore these for simplicity.

Integral to REST is the notion of a "resource", which is an item that can be addressed by a resource-id. POSTing to a resource type creates a single resource. The resource types defined in Harness are:

  • engines: an Engine instance is created from a Template and holds its dataset, parameters, algorithms, and models, along with everything needed to learn from the dataset and produce a model that allows the Engine to respond to queries.
    • events: sub-collections that make up the dataset used by a specific Engine. To send data to an Engine, simply POST a JSON Event, whose format is defined by the Engine, to /engines/<engine-id>/events. Non-reserved events (no $ in the name) can be thought of as an unending stream. Reserved events like $set may cause properties of mutable objects to be changed immediately upon being received and may even alter properties of the model. See the Engine description for how events are formatted, validated, and processed.
    • queries: queries are created so Engines can return information based on their models. See the Engine documentation for their formats.
    • jobs: creates a long-lived task such as a training task for a Lambda Engine. (A sketch of these resource calls follows this list.)
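
To make the resource model concrete, here is a hedged Python sketch of the administrative calls: creating an Engine instance from its JSON config and starting a training job. These are roughly the calls the harness-cli add and train commands make on your behalf; confirm the exact routes and payloads against the Harness REST Specification referenced below.

import json
import requests

HARNESS_URL = "http://localhost:9090"  # default port used in the examples below

# Create an Engine instance by POSTing its JSON config
# (roughly what `harness-cli add <engine-json>` does; see the REST Specification).
with open("engine.json") as f:
    config = json.load(f)
requests.post(f"{HARNESS_URL}/engines", json=config).raise_for_status()

# POSTing to the jobs sub-collection creates a long-lived training task for a
# Lambda Engine (roughly what `harness-cli train <engine-id>` does).
engine_id = config["engineId"]
requests.post(f"{HARNESS_URL}/engines/{engine_id}/jobs").raise_for_status()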

For the full Harness REST API and response codes, see the Harness REST Specification

Input and Query

There are 2 primary APIs in the SDK for sending Engine events and making queries.

Disregarding the optional TLS and Authentication, simple input and queries for the UR look like this:

Example Input

curl -H "Content-Type: application/json" -d '
{
   "event" : "buy",
   "entityType" : "user",
   "entityId" : "John Doe",
   "targetEntityType" : "item",
   "targetEntityId" : "iPad",
   "eventTime" : "2019-02-17T21:02:49.228Z"
}' http://localhost:9090/engines/<some-engine-id>/events

Notice that this posts the JSON body to the http://localhost:9090/engines/<some-engine-id>/events endpoint.

Example Query

curl -H "Content-Type: application/json" -d '
{
  "user": "John Doe"
}' http://localhost:9090/engines/<some-engine-id>/queries

Notice that this posts the JSON body to the http://localhost:9090/engines/<some-engine-id>/queries endpoint.

Example Query Result

The result of the query will be in the response body and looks something like this:

{
    "result":[
        {"item":"Pixel Slate","score":1.4384104013442993},
        {"item":"Surface Pro","score":0.28768208622932434}
    ]
}

For specifics of the format and use of events and queries see the Engine specific documentation—for example The Universal Recommender.
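
The same input and query can also be sent programmatically. Below is a minimal Python sketch using the requests library directly (rather than the Harness SDKs) against the endpoints shown above, still with TLS and Auth disabled; the engine id is a placeholder.

import requests

# Placeholder engine URL; substitute your engine id.
ENGINE_URL = "http://localhost:9090/engines/some-engine-id"

# Send one input event (same body as the curl example above).
event = {
    "event": "buy",
    "entityType": "user",
    "entityId": "John Doe",
    "targetEntityType": "item",
    "targetEntityId": "iPad",
    "eventTime": "2019-02-17T21:02:49.228Z",
}
requests.post(f"{ENGINE_URL}/events", json=event).raise_for_status()

# Make a query and print the ranked items from the result.
resp = requests.post(f"{ENGINE_URL}/queries", json={"user": "John Doe"})
resp.raise_for_status()
for hit in resp.json()["result"]:
    print(hit["item"], hit["score"])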

Controlling and Communicating with Harness

  • The Harness CLI

    The Harness server has admin-type commands, which are used to create and control the workflow of Engines and perform other admin tasks. The CLI acts as a client to the REST API and so may be run remotely. The project lives in its own repo here.

  • The Admin REST API A subset of the complete REST API implements all of the functionality of the Harness-CLI and so can be flexibly triggered remotely even without the CLI.

  • Security

    Harness optionally supports SSL and token based Authentication with Authorization.

  • Java Client SDK

    The Java SDK is currently provided as source and build instructions. To use it, include the source and required Java artifacts as shown in the examples, then build them into your Java application. Only "Client" role operations are supported by this SDK: input and query.

  • Python Client SDK

    The CLI is implemented using the Python SDK so they are packaged together here

  • Config

    To change this and other global behavior (not Engine specific), read the Config Page.

Installation

There are several ways to install and run Harness. The primary release method is through container images, but a source project is also maintained for those who wish to use it.

Versions

See versions.md for released versions, or look in the "develop" branch for works in progress.

harness's People

Contributors

den-pro-soft, dennybaa, dionipinho, govale, pferrel, ponchov, qqmbr4k, tmaior


harness's Issues

Running the command "harness-cli train test_ur" throws an error

System

  • Ubuntu 18.04
  • 16 GB RAM
  • 512 SSD

Installed using the doc link Install harness.

What I'm doing

Steps:

  1. harness-start
  2. harness-cli add <path-to-engine-test_ur-json>
  3. python3 import_mobile_device_ur_data_by_event_time.py
  4. harness-cli train test_ur

After performing the 3rd step, I'm getting this error

Error

:01.177 INFO  NettyBlockTransferService - Server created on 192.168.1.2:35773
12:20:01.178 INFO  BlockManager      - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
12:20:01.195 INFO  BlockManagerMaster - Registering BlockManager BlockManagerId(driver, 192.168.1.2, 35773, None)
12:20:01.197 INFO  BlockManagerMasterEndpoint - Registering block manager 192.168.1.2:35773 with 2.1 GB RAM, BlockManagerId(driver, 192.168.1.2, 35773, None)
12:20:01.200 INFO  BlockManagerMaster - Registered BlockManager BlockManagerId(driver, 192.168.1.2, 35773, None)
12:20:01.200 INFO  BlockManager      - Initialized BlockManager: BlockManagerId(driver, 192.168.1.2, 35773, None)
12:20:01.208 INFO  ContextHandler    - Started o.s.j.s.ServletContextHandler@706874a{/metrics/json,null,AVAILABLE,@Spark}
12:20:01.251 INFO  URAlgorithm       - Engine-id: test_ur. Spark context spark.submit.deployMode: client
12:20:01.262 WARN  SparkContext      - Using an existing SparkContext; some configuration may not take effect.
12:20:01.468 INFO  MemoryStore       - Block broadcast_0 stored as values in memory (estimated size 8.6 KB, free 2.1 GB)
12:20:01.630 INFO  MemoryStore       - Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.0 KB, free 2.1 GB)
12:20:01.632 INFO  BlockManagerInfo  - Added broadcast_0_piece0 in memory on 192.168.1.2:35773 (size: 2.0 KB, free: 2.1 GB)
12:20:01.638 INFO  SparkContext      - Created broadcast 0 from broadcast at MongoSpark.scala:542
12:20:01.801 INFO  SharedState       - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/home/dinesh/ml/harness/harnesscli/harness-cli/examples/spark-warehouse').
12:20:01.802 INFO  SharedState       - Warehouse path is 'file:/home/dinesh/ml/harness/harnesscli/harness-cli/examples/spark-warehouse'.
12:20:01.810 INFO  ContextHandler    - Started o.s.j.s.ServletContextHandler@1593bae9{/SQL,null,AVAILABLE,@Spark}
12:20:01.810 INFO  ContextHandler    - Started o.s.j.s.ServletContextHandler@4c840fc7{/SQL/json,null,AVAILABLE,@Spark}
12:20:01.811 INFO  ContextHandler    - Started o.s.j.s.ServletContextHandler@47ca1b15{/SQL/execution,null,AVAILABLE,@Spark}
12:20:01.812 INFO  ContextHandler    - Started o.s.j.s.ServletContextHandler@4a0406b{/SQL/execution/json,null,AVAILABLE,@Spark}
12:20:01.814 INFO  ContextHandler    - Started o.s.j.s.ServletContextHandler@5a20d5fd{/static/sql,null,AVAILABLE,@Spark}
12:20:02.091 INFO  StateStoreCoordinatorRef - Registered StateStoreCoordinator endpoint
12:20:02.096 WARN  SparkSession$Builder - Using an existing SparkSession; some configuration may not take effect.
12:20:02.100 INFO  MemoryStore       - Block broadcast_1 stored as values in memory (estimated size 8.7 KB, free 2.1 GB)
12:20:02.126 INFO  MemoryStore       - Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 2.1 GB)
12:20:02.126 INFO  BlockManagerInfo  - Added broadcast_1_piece0 in memory on 192.168.1.2:35773 (size: 2.0 KB, free: 2.1 GB)
12:20:02.127 INFO  SparkContext      - Created broadcast 1 from broadcast at MongoSpark.scala:542
12:20:02.129 INFO  URPreparator$     - Indicators: List(purchase, view, category-pref, addtocart, transaction)
12:20:02.192 INFO  cluster           - Cluster created with settings {hosts=[localhost:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
12:20:02.201 INFO  cluster           - Cluster description not yet available. Waiting for 30000 ms before timing out
12:20:02.204 INFO  connection        - Opened connection [connectionId{localValue:3, serverValue:7}] to localhost:27017
12:20:02.205 INFO  cluster           - Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 2, 1]}, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=602758}
12:20:02.206 INFO  MongoClientCache  - Creating MongoClient: [localhost:27017]
12:20:02.215 INFO  connection        - Opened connection [connectionId{localValue:4, serverValue:8}] to localhost:27017
12:20:02.250 ERROR URAlgorithm       - Spark computation failed for engine test_ur with params {{"engineId":"test_ur","engineFactory":"com.actionml.engines.ur.UREngine","sparkConf":{"master":"local","spark.driver-memory":"4g","spark.executor-memory":"4g"},"algorithm":{"indicators":[{"name":"purchase"},{"name":"view","maxCorrelatorsPerItem":50},{"name":"category-pref","maxCorrelatorsPerItem":50,"minLLR":5.0},{"name":"addtocart"},{"name":"transaction"}],"num":4}}}
java.lang.IllegalArgumentException: null
        at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
        at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
        at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
        at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:46)
        at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:449)
        at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:432)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
        at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
        at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:432)
        at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
        at org.apache.xbean.asm5.ClassReader.b(Unknown Source)
        at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
        at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
        at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:262)
        at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:261)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:261)
        at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
        at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2073)
        at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1364)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
        at org.apache.spark.rdd.RDD.take(RDD.scala:1337)
        at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply$mcZ$sp(RDD.scala:1472)
        at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply(RDD.scala:1472)
        at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply(RDD.scala:1472)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
        at org.apache.spark.rdd.RDD.isEmpty(RDD.scala:1471)
        at com.actionml.engines.ur.URPreparator$$anonfun$8.apply(URPreparator.scala:58)
        at com.actionml.engines.ur.URPreparator$$anonfun$8.apply(URPreparator.scala:58)
        at scala.collection.TraversableLike$$anonfun$filterImpl$1.apply(TraversableLike.scala:248)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at scala.collection.TraversableLike$class.filterImpl(TraversableLike.scala:247)
        at scala.collection.TraversableLike$class.filterNot(TraversableLike.scala:267)
        at scala.collection.AbstractTraversable.filterNot(Traversable.scala:104)
        at com.actionml.engines.ur.URPreparator$.mkTraining(URPreparator.scala:58)
        at com.actionml.engines.ur.URAlgorithm$$anonfun$train$1.apply(URAlgorithm.scala:262)
        at com.actionml.engines.ur.URAlgorithm$$anonfun$train$1.apply(URAlgorithm.scala:256)
        at scala.util.Success$$anonfun$map$1.apply(Try.scala:237)
        at scala.util.Try$.apply(Try.scala:192)
        at scala.util.Success.map(Try.scala:237)
        at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
        at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
        at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:20:02.256 INFO  SparkContextSupport$ - Job 63becf80-4c6c-4cd4-9181-e0ffa278f37c completed in 1572591002256 ms [engine test_ur]
12:20:02.258 ERROR MongoAsyncDao     - Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(63becf80-4c6c-4cd4-9181-e0ffa278f37c)))
12:20:02.259 ERROR MongoAsyncDao     - Sync DAO error
java.lang.RuntimeException: Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(63becf80-4c6c-4cd4-9181-e0ffa278f37c)))
        at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:135)
        at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:129)
        at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
        at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
12:20:02.263 INFO  AbstractConnector - Stopped Spark@7cd51c26{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
12:20:02.265 INFO  SparkUI           - Stopped Spark web UI at http://192.168.1.2:4040
12:20:02.266 ERROR AsyncEventQueue   - Listener JobManagerListener threw an exception
java.lang.RuntimeException: Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(63becf80-4c6c-4cd4-9181-e0ffa278f37c)))
        at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:135)
        at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:129)
        at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
        at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)

Here is the data that I have in the 'events' collection in MongoDB:

{ "_id" : ObjectId("5dbabcf41b62a3803294ace9"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "iPhone XS", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-28T18:04:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acec"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "iPhone XR", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-27T22:52:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acee"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "iPhone 8", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-27T03:40:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf0"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "iPad Pro", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-26T08:28:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf2"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "U 2", "targetEntityId" : "Pixel Slate", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-25T13:16:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf4"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "U 2", "targetEntityId" : "Galaxy 8", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-24T18:04:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf6"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-3", "targetEntityId" : "Surface Pro", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-23T22:52:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acf8"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone XR", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-23T03:40:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acfa"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone XR", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-22T08:28:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acfc"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone XR", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-21T13:16:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294acfe"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone XR", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-20T18:04:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad00"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "iPhone 8", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-19T22:52:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad02"), "eventId" : null, "event" : "purchase", "entityType" : "user", "entityId" : "u-4", "targetEntityId" : "Galaxy 8", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-19T03:40:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad04"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-18T08:28:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad06"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-17T13:16:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad08"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-16T18:04:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad0a"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-15T22:52:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad0c"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-15T03:40:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad0e"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Phones", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-14T08:28:36.518Z") }
{ "_id" : ObjectId("5dbabcf41b62a3803294ad10"), "eventId" : null, "event" : "view", "entityType" : "user", "entityId" : "u1", "targetEntityId" : "Mobile-acc", "dateProps" : {  }, "categoricalProps" : {  }, "floatProps" : {  }, "booleanProps" : {  }, "eventTime" : ISODate("2019-10-13T13:16:36.518Z") }

Is this error occurring because the 'eventId' is null?

Harness doesn't create ES Index

When I want to create an Engine from my engine template, Harness doesn't create an Index in Elasticsearch. Does anyone know why?

My sample engine.json looks like:

{
	"engineId": "ecommerce",
	"engineFactory": "com.actionml.engines.ur.UREngine",
	"sparkConf": {
		"master": "local",
		"spark.driver-memory": "8g",
		"spark.executor-memory": "16g",
		"spark.serializer": "org.apache.spark.serializer.KryoSerializer",
		"spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
		"spark.kryo.referenceTracking": "false",
		"spark.kryoserializer.buffer": "300m",
		"spark.es.index.auto.create": "true",
		"es.index.auto.create": "true"
	},
	"algorithm": {
		"indicators": [
			{
				"name": "buy"
			},
			{
				"name": "detail-view"
			},
			{
				"name": "search-terms"
			}
		]
	}
}

Further down in the logs, the output shows that the ES index name is null.

"blacklistEvents": [] not working

Hi!

I cannot figure out where I am going wrong. I am trying to remove the default blacklist of the primary indicator. Here is my engine.json:

Fresh docker deployment from the newest dev.

{
    "engineId": "ecom_ur",
    "engineFactory": "com.actionml.engines.ur.UREngine",
    "sparkConf": {
	"master": "local",
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
        "spark.kryo.referenceTracking": "false",
        "spark.kryoserializer.buffer": "300m",
        "spark.executor.memory": "20g",
        "spark.driver.memory": "10g",
        "spark.es.index.auto.create": "true",
        "spark.es.nodes": "192.168.40.225",
        "spark.es.nodes.wan.only": "true"
    },
    "algorithm":{
        "indicators": [ 
            {
                "name": "buy"
            }
        ],
	"maxQueryEvents": 19,
	"blacklistEvents": [],
	"maxCorrelatorsPerEventType": 49
    }
}

But it does not affect the generated engine... There is also an issue with "maxQueryEvents", btw. Below is the log for the generated engine:

Engine-id: ecom_ur. Initialized with JSON: {"engineId":"ecom_ur","engineFactory":"com.actionml.engines.ur.UREngine","sparkConf":{"master":"local","spark.serializer":"org.apache.spark.serializer.KryoSerializer","spark.kryo.registrator":"org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator","spark.kryo.referenceTracking":"false","spark.kryoserializer.buffer":"300m","spark.executor.memory":"20g","spark.driver.memory":"10g","spark.es.index.auto.create":"true","spark.es.nodes":"192.168.40.225","spark.es.nodes.wan.only":"true"},"algorithm":{"indicators":[{"name":"buy"}],"maxQueryEvents":19,"blacklistEvents":[],"maxCorrelatorsPerEventType":49}}
10:39:51.697 INFO  URAlgorithm       - Engine-id: ecom_ur. Events to alias mapping: Map(buy -> buy)
10:39:51.697 INFO  package$          - 
╔════════════════════════════════════════════════════════════════════════════════╗
║ URAlgorithm initialization parameters including "defaults"                     ║
║ ══════════════════════════════════════════════════════════════════════════════ ║
║ ES index name:                          null                                   ║
║ ES type name:                           items                                  ║
║ RecsModel:                              all                                    ║
║ Indicators:                             List(IndicatorParams(buy,None,None,None,None)) ║
║ ══════════════════════════════════════════════════════════════════════════════ ║
║ Random seed:                            326194065                              ║
║ MaxCorrelatorsPerEventType:             49                                     ║
║ MaxEventsPerEventType:                  500                                    ║
║ BlacklistEvents:                        List(buy)                              ║
║ ══════════════════════════════════════════════════════════════════════════════ ║
║ User bias:                              1.0                                    ║
║ Item bias:                              1.0                                    ║
║ Max query events:                       1000                                   ║
║ Limit:                                  20                                     ║
║ ══════════════════════════════════════════════════════════════════════════════ ║
║ Rankings:                                                                      ║
║ popular                                 Some(popRank)                          ║
╚════════════════════════════════════════════════════════════════════════════════╝

I would really appreciate some insight. I tried all possible combinations for the JSON ([[]], [{}], [""], etc...).

The Elasticsearch Spark connector does not write to a remote Spark cluster

In the engine's JSON config:

    "sparkConf": {
        "master": "local",
        ...

The training succeeds and the ur-integration-test.sh passes

Changing to a remote master:

    "sparkConf": {
        "master": "spark://Maclaurin.local:7077",
        ...

The test fails and the index is not written to ES. This seems to be why there are no results; the index is never created, so no query can succeed.

The part of main.log that has to do with Spark training is:

9:20:30.976 INFO  SparkContext      - Added JAR /Users/pat/harness/rest-server/Harness-0.4.0-SNAPSHOT/lib/commons-codec.commons-codec-1.10.jar at spark://192.168.1.5:52290/jars/commons-codec.commons-codec-1.10.jar with timestamp 1546658430976
19:20:30.978 INFO  StandaloneAppClient$ClientEndpoint - Connecting to master spark://Maclaurin.local:7077...
19:20:30.983 INFO  TransportClientFactory - Successfully created connection to Maclaurin.local/127.0.0.1:7077 after 1 ms (0 ms spent in bootstraps)
19:20:31.001 WARN  StandaloneAppClient$ClientEndpoint - Failed to connect to master Maclaurin.local:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
	at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:197)
	at java.io.DataInputStream.readUTF(DataInputStream.java:609)
	at java.io.DataInputStream.readUTF(DataInputStream.java:564)
	at org.apache.spark.rpc.netty.RequestMessage$.readRpcAddress(NettyRpcEnv.scala:593)
	at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:603)
	at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:662)
	at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:647)
	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:187)
	at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:111)
	at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:745)

	at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:189)
	at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
	... 1 common frames omitted
19:20:39.302 INFO  StandaloneAppClient$ClientEndpoint - Connecting to master spark://Maclaurin.local:7077...
19:20:39.304 WARN  StandaloneAppClient$ClientEndpoint - Failed to connect to master Maclaurin.local:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
	at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:197)
	at java.io.DataInputStream.readUTF(DataInputStream.java:609)
	at java.io.DataInputStream.readUTF(DataInputStream.java:564)
	at org.apache.spark.rpc.netty.RequestMessage$.readRpcAddress(NettyRpcEnv.scala:593)
	at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:603)
	at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:662)
	at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:647)
	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:187)
	at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:111)
	at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:745)

	at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:189)
	at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
	... 1 common frames omitted
19:20:50.982 INFO  StandaloneAppClient$ClientEndpoint - Connecting to master spark://Maclaurin.local:7077...
19:20:50.985 WARN  StandaloneAppClient$ClientEndpoint - Failed to connect to master Maclaurin.local:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
	at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:197)
	at java.io.DataInputStream.readUTF(DataInputStream.java:609)
	at java.io.DataInputStream.readUTF(DataInputStream.java:564)
	at org.apache.spark.rpc.netty.RequestMessage$.readRpcAddress(NettyRpcEnv.scala:593)
	at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:603)
	at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:662)
	at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:647)
	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:187)
	at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:111)
	at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:745)

	at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:189)
	at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
	... 1 common frames omitted
19:20:59.304 ERROR StandaloneSchedulerBackend - Application has been killed. Reason: All masters are unresponsive! Giving up.
19:20:59.304 WARN  StandaloneSchedulerBackend - Application ID is not initialized yet.
19:20:59.305 INFO  ServerConnector   - Stopped Spark@5acb809d{HTTP/1.1}{0.0.0.0:4040}
19:20:59.306 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@7cb5c916{/stages/stage/kill,null,UNAVAILABLE,@Spark}
19:20:59.306 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@1d5d8d08{/jobs/job/kill,null,UNAVAILABLE,@Spark}
19:20:59.307 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@58d35a9c{/api,null,UNAVAILABLE,@Spark}
19:20:59.307 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@6bd1e619{/,null,UNAVAILABLE,@Spark}
19:20:59.308 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@15be96fb{/static,null,UNAVAILABLE,@Spark}
19:20:59.308 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@5b33cf09{/executors/threadDump/json,null,UNAVAILABLE,@Spark}
19:20:59.308 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@2472e86f{/executors/threadDump,null,UNAVAILABLE,@Spark}
19:20:59.309 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@1c0c53b6{/executors/json,null,UNAVAILABLE,@Spark}
19:20:59.309 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@130bb905{/executors,null,UNAVAILABLE,@Spark}
19:20:59.310 INFO  Utils             - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52297.
19:20:59.310 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@5295a565{/environment/json,null,UNAVAILABLE,@Spark}
19:20:59.310 INFO  NettyBlockTransferService - Server created on 192.168.1.5:52297
19:20:59.310 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@22b35eb6{/environment,null,UNAVAILABLE,@Spark}
19:20:59.310 INFO  BlockManager      - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19:20:59.310 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@600df0b2{/storage/rdd/json,null,UNAVAILABLE,@Spark}
19:20:59.310 INFO  BlockManagerMaster - Registering BlockManager BlockManagerId(driver, 192.168.1.5, 52297, None)
19:20:59.311 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@415011fb{/storage/rdd,null,UNAVAILABLE,@Spark}
19:20:59.311 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@2e1bd9bc{/storage/json,null,UNAVAILABLE,@Spark}
19:20:59.311 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@324e164b{/storage,null,UNAVAILABLE,@Spark}
19:20:59.312 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@3cd3b5e6{/stages/pool/json,null,UNAVAILABLE,@Spark}
19:20:59.312 INFO  BlockManagerMasterEndpoint - Registering block manager 192.168.1.5:52297 with 1007.4 MB RAM, BlockManagerId(driver, 192.168.1.5, 52297, None)
19:20:59.312 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@37ab01ed{/stages/pool,null,UNAVAILABLE,@Spark}
19:20:59.312 INFO  BlockManagerMaster - Registered BlockManager BlockManagerId(driver, 192.168.1.5, 52297, None)
19:20:59.312 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@5fc8703e{/stages/stage/json,null,UNAVAILABLE,@Spark}
19:20:59.312 INFO  BlockManager      - Initialized BlockManager: BlockManagerId(driver, 192.168.1.5, 52297, None)
19:20:59.312 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@772f2790{/stages/stage,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@5e3aa92f{/stages/json,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@41c529db{/stages,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@433d6e4d{/jobs/job/json,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@383221d1{/jobs/job,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@3125d8b4{/jobs/json,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO  ContextHandler    - Stopped o.s.j.s.ServletContextHandler@22d415c4{/jobs,null,UNAVAILABLE,@Spark}
19:20:59.313 INFO  ContextHandler    - Started o.s.j.s.ServletContextHandler@27f75c54{/metrics/json,null,AVAILABLE,@Spark}
19:20:59.313 INFO  SparkUI           - Stopped Spark web UI at http://192.168.1.5:4040
19:20:59.314 ERROR SparkContext      - Error initializing SparkContext.
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.
This stopped SparkContext was created at:

org.apache.spark.SparkContext.<init>(SparkContext.scala:76)
com.actionml.core.spark.SparkContextSupport$$anonfun$createSparkContext$1.apply(SparkContextSupport.scala:125)
com.actionml.core.spark.SparkContextSupport$$anonfun$createSparkContext$1.apply(SparkContextSupport.scala:112)
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

The currently active SparkContext was created at:

(No active SparkContext.)
         
	at org.apache.spark.SparkContext.assertNotStopped(SparkContext.scala:100)
	at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1679)
	at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2227)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:563)
	at com.actionml.core.spark.SparkContextSupport$$anonfun$createSparkContext$1.apply(SparkContextSupport.scala:125)
	at com.actionml.core.spark.SparkContextSupport$$anonfun$createSparkContext$1.apply(SparkContextSupport.scala:112)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
	at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)

Can't set item properties

Hello,

I have installed Harness via docker-compose and went through the https://actionml.com/docs/h_ur_quickstart tutorial step by step.

Right now I have UR engine up and running and I can receive recommendations for a user.

Now I'm trying to set item properties and I'm receiving the error "ERROR URDataset - Can't parse UREvent from JObject".

Full error message:
14:11:19.419 INFO ActorSystemImpl - Harness Server: HttpRequest(HttpMethod(POST), http://0.0.0.0:9090/engines/movie_fit/events, List(Accept: application/json, User-Agent: rest-client/2.1.0 (darwin19.0.0 x86_64) ruby/2.6.5p114, Accept-Encoding: gzip, deflate;q=0.6, identity;q=0.3, Host: 0.0.0.0:9090, Timeout-Access: <function1>),HttpEntity.Strict(application/json,{"event":"$set","entityType":"item","entityId":"1","properties":{"title_type":["movie"],"genres":["4","5"],"persons":["37","1","2","3","4","5","6","7","8","9"]}}),HttpProtocol(HTTP/1.1))
14:11:19.424 ERROR URDataset - Can't parse UREvent from JObject(List((event,JString($set)), (entityType,JString(item)), (entityId,JString(1)), (properties,JObject(List((title_type,JArray(List(JString(movie)))), (genres,JArray(List(JString(4), JString(5)))), (persons,JArray(List(JString(37), JString(1), JString(2), JString(3), JString(4), JString(5), JString(6), JString(7), JString(8), JString(9)))))))))
org.json4s.package$MappingException: Can't convert JNothing to class java.lang.String
	at org.json4s.CustomSerializer$$anonfun$deserialize$1.applyOrElse(Formats.scala:403)
	at org.json4s.CustomSerializer$$anonfun$deserialize$1.applyOrElse(Formats.scala:400)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
	at scala.collection.AbstractMap.applyOrElse(Map.scala:59)
	at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
	at org.json4s.Extraction$.customOrElse(Extraction.scala:600)
	at org.json4s.Extraction$.extract(Extraction.scala:387)
	at org.json4s.Extraction$.extract(Extraction.scala:39)
	at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:21)
	at com.actionml.core.validate.JsonSupport$DateReader$.read(JsonSupport.scala:44)
	at com.actionml.core.validate.JsonSupport$DateReader$.read(JsonSupport.scala:42)
	at org.json4s.ExtractableJsonAstNode.as(ExtractableJsonAstNode.scala:74)
	at com.actionml.engines.ur.URDataset.com$actionml$engines$ur$URDataset$$toUrEvent(URDataset.scala:238)
	at com.actionml.engines.ur.URDataset$$anonfun$input$1.apply(URDataset.scala:98)
	at com.actionml.engines.ur.URDataset$$anonfun$input$1.apply(URDataset.scala:98)
	at cats.data.Validated.andThen(Validated.scala:197)
	at com.actionml.engines.ur.URDataset.input(URDataset.scala:98)
	at com.actionml.engines.ur.UREngine$$anonfun$2.apply(UREngine.scala:113)
	at com.actionml.engines.ur.UREngine$$anonfun$2.apply(UREngine.scala:113)
	at cats.data.Validated.andThen(Validated.scala:197)
	at com.actionml.engines.ur.UREngine.input(UREngine.scala:113)
	at com.actionml.router.service.EventServiceImpl$$anonfun$receive$1.applyOrElse(EventService.scala:47)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
	at com.actionml.router.service.EventServiceImpl.aroundReceive(EventService.scala:35)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
	at akka.actor.ActorCell.invoke(ActorCell.scala:495)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
	at akka.dispatch.Mailbox.run(Mailbox.scala:224)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14:11:19.426 INFO UREngine - Engine-id: movie_fit. Bad input Whoops, no response string
14:11:19.430 INFO ActorSystemImpl - Harness Server: Response for

Please assist.
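Note: the stack trace above fails inside JsonSupport$DateReader, which suggests a required date field (most likely eventTime) is missing from the $set event rather than the properties being wrong. Below is a minimal sketch of resending the same event with an eventTime added, using plain Python requests; the field name and format are an educated guess, not a confirmed fix.

import requests

# Sketch only: the same $set event as in the log above, with an eventTime
# added. eventTime (ISO-8601) is a guess based on the DateReader frame in
# the trace, not a confirmed requirement.
event = {
    "event": "$set",
    "entityType": "item",
    "entityId": "1",
    "properties": {
        "title_type": ["movie"],
        "genres": ["4", "5"],
        "persons": ["37", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
    },
    "eventTime": "2019-12-01T00:00:00.000Z",
}

resp = requests.post(
    "http://0.0.0.0:9090/engines/movie_fit/events",  # endpoint taken from the log above
    json=event,
)
print(resp.status_code, resp.text)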

"Can't parse UREvent from..." while importing

I'm trying to import data exported from PIO + UR, but I get this error whenever the import hits an event with non-empty properties. It looks like Harness is wrongly trying to decode nested fields inside properties?

ERROR URDataset         - Can't parse UREvent from JObject(List((eventId,JString(AzNHWFcHuzpoYRf6IYtzXQAAAWtyVJTmqHaszTjE9tE)), (event,JString(FEED_LOADED)), (entityType,JString(account)), (entityId,JString(5b99b6fe520c627dac40cdae)), (properties,JObject(List((request,JObject(List((cursor,JNull), (limit,JInt(20)), (excludeIds,JArray(List())), (itemId,JString(5b60479c9b7e7425997fd083)), (orientation,JString(HORIZONTAL)), (clientInfo,JObject(List((application,JString(sandbox)), (appVersion,JObject(List((version,JString(0.3.2))))), (platform,JString(android)), (platformVersion,JObject(List((version,JString(8.1.0))))))))))), (feed,JObject(List((src,JObject(List((value,JString(ALSO_LIKE_HORIZONTAL)), (version,JString(v1))))), (ids,JArray(List(JString(5bbb229a24a24e05e22d757d), JString(5980640e981f94743ba6f2e8), JString(5c6164e8cfd54a435d7ee07e), JString(5a019543faf20e44181724f3), JString(5c9e10243b2b4409a29ee6e0), JString(5a4e3910520c624d8140809c), JString(5bd96e35dfa87c64648cadbf), JString(5b658546c4dc564e985ed908), JString(5a7d8b3024a24e6d074cc19f), JString(59848a6bf21c3a58a835a709), JString(59dde470f21c3a30ba5e49dd), JString(5b87acd4f1643f14005f15d7), JString(5b28ba703b2b442d2504d2cd), JString(5c936fcdc56a46246debfffd), JString(5c5452d5775ea36be2e1e9ef), JString(5a636ab6520c6256a319c420), JString(5c1a0ade581445366a2bed66), JString(5a6c3d5cc4dc565da74afed7), JString(5cb07221305a953825346406), JString(5b695c9bdfa87c43e438c457), JString(5bf7d47244106437f68f5af1), JString(5c98d4a6441064047e0e4beb), JString(5b923a4c24a24e1181add7c3), JString(594d11b7139771578a8949ed)))), (scores,JArray(List(JDouble(-1.0), JDouble(-1.0), JDouble(4.0), JDouble(-1.0), JDouble(4.0), JDouble(-1.0), JDouble(4.0), JDouble(4.0), JDouble(4.0), JDouble(4.0), JDouble(5.0), JDouble(4.0), JDouble(4.0), JDouble(-1.0), JDouble(4.0), JDouble(4.0), JDouble(4.0), JDouble(5.0), JDouble(-1.0), JDouble(4.0), JDouble(4.0), JDouble(-1.0), JDouble(4.0), JDouble(4.0)))))))))), (eventTime,JString(2019-06-20T00:41:14.214Z)), (creationTime,JString(2019-06-20T00:41:14.214Z))))
java.lang.RuntimeException: Can't parse date JString(5b60479c9b7e7425997fd083)
...
15:41:22.392 INFO  UREngine          - Bad input  Whoops, no response string 
15:41:22.392 ERROR URDataset         - Can't parse date HORIZONTAL
java.time.format.DateTimeParseException: Text 'HORIZONTAL' could not be parsed at index 0

The events without properties are being imported with no problems.
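A workaround sketch, not a fix: split the exported file into events that import cleanly and events with nested objects in properties, assuming the export is one JSON event per line (adjust if your export format differs).

import json

# Split a PIO export (assumed: one JSON event per line) into events with flat
# property values and events whose "properties" contain nested objects, which
# currently trip the UREvent parser.
def has_flat_properties(event):
    props = event.get("properties") or {}
    return all(not isinstance(v, dict) for v in props.values())

with open("pio-export.json") as src, \
        open("flat-events.json", "w") as flat, \
        open("nested-events.json", "w") as nested:
    for line in src:
        if not line.strip():
            continue
        event = json.loads(line)
        out = flat if has_flat_properties(event) else nested
        out.write(json.dumps(event) + "\n")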

UR results return "ranks": null

No ranks should be returned if the score is null, which means they have not been calculated or configured. We need to decide what to do when they are configured but there is no value.

blacklistEvents renamed to blacklistIndicators

In this commit "polish docs, move to docs dir":

aa88533#diff-5778ab63eb5a74b5b25b27d166c80b32

The algorithm parameter blacklistEvents was renamed to blacklistIndicators without changing the documentation. I understand it was renamed to match the indicators config, but why was the doc not updated? I discovered that blacklistEvents was not showing up in engine status and had to check the code.

See https://actionml.com/docs/h_ur_config#complete-ur-engine-configuration-specification
"blacklistEvents": ["", "", "", ""],

Author @pferrel
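Until the docs catch up, here is a small migration sketch for existing configs, assuming only the key name changed and that it lives in the algorithm section (both assumptions; check your engine status output).

import json

# Rename blacklistEvents -> blacklistIndicators in an engine config file.
# The file name and the key's location under "algorithm" are assumptions.
with open("engine-config.json") as f:
    config = json.load(f)

algo = config.get("algorithm", {})
if "blacklistEvents" in algo:
    algo["blacklistIndicators"] = algo.pop("blacklistEvents")

with open("engine-config.json", "w") as f:
    json.dump(config, f, indent=2)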

Timeout warning when creating a user, using auth

The following appears in logs when creating a user using auth:

15:01:05.050 WARN  ActorSystemImpl   - HTTP header 'Timeout-Access: <function1>' is not allowed in requests
15:01:05.077 INFO  ActorSystemImpl   - Complete: HttpMethod(POST):http://localhost:9090/auth/users -> 200 OK [65 ms.]

dist directory in tarball is redundant

When running make-dist for the auth server and Harness, there is a "dist" directory that mirrors all of the other root directories. This is confusing, bloats the dist tarball, and should be removed.

test automation

We need lots more integration tests.

Use Denny's make setup to do more integration tests for:

  • full admin cli
  • tests for each engine's response to cli, import, train etc
  • algorithm results tests
  • TLS?, Auth?
  • Java SDK, Python SDK?
  • other

We may want to have other targets like a no-auth-server build, etc. A rough smoke-test sketch follows.
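As a starting point, a very rough smoke-test sketch of the kind of check meant above: add an engine via the CLI, then hit the REST API and assert it answers. The engine status REST path and the test engine id are assumptions; the CLI call and the response shape mirror what appears elsewhere in these issues.

import json
import subprocess
import urllib.request

# Rough integration smoke test: add an engine, then ask the server about it.
# The /engines/<id> path and the "test_ur" id are assumptions.
subprocess.run(["harness-cli", "add", "data/test-engine.json"], check=True)

with urllib.request.urlopen("http://localhost:9090/engines/test_ur") as resp:
    body = json.loads(resp.read())
    assert resp.status == 200

assert body["engineParams"]["engineId"] == "test_ur"
print("engine responded:", body["engineParams"]["engineId"])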

Add DAO annotation for a field that is used for the "id" of the collection

Assuming a case class is used to store data in a collection, allow one field to be used as the unique id by annotating it with @id, basically following what Morphia does with annotations, which in turn follows some standard applied to Java use of DBs (can't remember which one).

This is very important because many case classes have been using _id to define the key in a case class, and I'm not sure that is still followed now that we are not using Salat & Casbah.

Better startup status checks

harness start and harness status print various settings but do no real status checks on the services that are required.

Status for Mongo, Elasticsearch, and anything the engines want to provide should be listed; a rough probe sketch follows the list below.

These will be several different types:

  • Harness Required services, Mongo, Auth-server (optional)
  • Harness Toolbox services, Elasticsearch
  • Engine services such as VW for the CB. We need an Engine API that probes these
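A rough sketch of the kind of probe harness status could run, using only the standard library; hosts, ports, and the service list are assumptions for a default local setup.

import socket
import urllib.request

# Minimal service probes: TCP connect for MongoDB, HTTP GET for Elasticsearch.
def port_open(host, port, timeout=2):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def es_alive(url="http://localhost:9200", timeout=2):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

print("MongoDB:", "OK" if port_open("localhost", 27017) else "DOWN")
print("Elasticsearch:", "OK" if es_alive() else "DOWN")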

Harness fails with mongodb authentication

We are using the Harness server container to attach to an authenticated MongoDB.

We have tried multiple variations with sparkConf like:

#1
"spark.mongodb.input.uri": "mongodb://username:[email protected]/",
"spark.mongodb.input.database": "database-name",
"spark.mongodb.input.collection": "items",
"spark.mongodb.input.readPreference.name": "primaryPreferred"
#2 "spark.mongodb.input.uri":"mongodb://username:[email protected]:27017/database-name/?authSource=database-name",
"spark.mongodb.output.uri":"mongodb://username:[email protected]:27017/database-name/?authSource=database-name",
#3
"spark.mongodb.input.uri":"mongodb://username:[email protected]:27017/database-name",
"spark.mongodb.output.uri":"mongodb://username:[email protected]:27017/database-name"

But whenever we start training we get the same error:
WARNING: Partitioning failed.
Partitioning using the 'DefaultMongoPartitioner$' failed.
com.mongodb.MongoException: host and port should be specified in host:port format
at com.mongodb.ServerAddress.<init>(ServerAddress.java:123)
at com.mongodb.internal.connection.ServerAddressHelper.createServerAddress(ServerAddressHelper.java:33)
at com.mongodb.internal.connection.ServerAddressHelper.createServerAddress(ServerAddressHelper.java:26)

We do not get this error when we use a database without authentication.

Google groups thread : https://groups.google.com/forum/#!topic/actionml-user/kqOF0tLO0o0

Thanks,
Nikola
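One quick sanity check, purely as a sketch: confirm the URI itself parses into host:port pairs with pymongo before handing it to the Spark connector. The URI below is a placeholder; this does not exercise the connector's partitioner, which is where the error above is raised.

from pymongo import uri_parser

# Placeholder URI -- substitute your real host, credentials and database.
uri = "mongodb://username:password@mongo-host:27017/database-name?authSource=database-name"

parsed = uri_parser.parse_uri(uri)
print("hosts:", parsed["nodelist"])     # expect [('mongo-host', 27017)]
print("database:", parsed["database"])
print("options:", parsed["options"])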

MongoDB indexes not being created

I tried with Harness 0.4.0 & 0.5.0 and MongoDB indexes are not being created when a new UR engine is added.

I debugged the code and the method createIndexes is never executed in URDataset.init or MongoStore because ttl is always None.

I had to set a value for ttl in the ur-engine.json to have the indexes created. But if someone doesn't want a ttl, like me, the indexes won't be created.
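If you need the indexes before this is fixed and don't want a ttl, here is a manual workaround sketch with pymongo; the database and collection names below are guesses, so check what your engine actually created in MongoDB first.

from pymongo import ASCENDING, MongoClient

# Manual index creation as a stop-gap. Database/collection names are guesses.
client = MongoClient("mongodb://localhost:27017")
events = client["some_engine_db"]["events"]

# Plain index, no TTL:
events.create_index([("eventTime", ASCENDING)], name="eventTime_idx")

# Or, if expiration is wanted after all (field must be a BSON date):
# events.create_index([("eventTime", ASCENDING)], expireAfterSeconds=90 * 24 * 3600)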

Travis-CI Link Update

I think the original name of the project was pio-kappa?
The Travis-CI link is pointing to that and is not working for harness right now.

send Engine config from CLI host

harness add <some-file>

actually sends the path to the server instead of the contents of the JSON. Sending the JSON would make the CLI (except start/stop) completely remotable.
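A sketch of what "send the JSON, not the path" could look like from the CLI host, using only the standard library; the admin endpoint path and verb are assumptions, not the documented Harness API.

import json
import urllib.request

# Read the config locally on the CLI host and ship its contents to the server.
with open("some-engine-config.json") as f:
    config = json.load(f)

req = urllib.request.Request(
    "http://localhost:9090/engines",             # assumed admin endpoint
    data=json.dumps(config).encode("utf-8"),     # the contents, not the path
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())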

harness update silent error

Updating should take an engineId in the CLI so it can be checked against the one in the config; otherwise you can easily update the wrong engine. Also, the new config should be reported in the CLI and REST response.

The Spark Context code does not correctly detect a failed Spark Job

There seem to be many cases where the creation of a SparkContext errors out so that the Spark Job is never created. Because there is no Spark Job to produce an error, harness train never reports a failure.

This causes the queued job to stay queued forever, so fixing the error is futile since a new job will never be run.

ALL conditions that fail to start a Spark Job OR cause the Job to return an error should dequeue the Harness JobManager job. This will allow for fixing the problem and re-executing harness train ...

hinting engine with nav-id = URL

If a nav-id is a URL there is a DB error "bad field name". This appears to be a Casbah bug that requires storing journeys in some new way.
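One guess at the cause: MongoDB rejects field names that contain dots (and keys starting with $), and URLs always contain dots. If that is what is happening, a common workaround is to encode the nav-id before using it as a field name, for example:

import base64

# Encode a URL so it is safe to use as a MongoDB field name; urlsafe base64
# produces only [A-Za-z0-9_=-], so no '.' or '$' can appear. Decode on read.
nav_id = "https://example.com/some/page.html"

field_name = base64.urlsafe_b64encode(nav_id.encode()).decode()
original = base64.urlsafe_b64decode(field_name.encode()).decode()

print(field_name)
print(original == nav_id)   # True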

Exception when deleting an Engine that uses Elasticsearch

If Elasticsearch is not running, harness delete <some-engine-id> will throw an exception because the delete of the model fails. This should test whether ES is running and report that it is not as an error, rather than throwing an exception that gives no clue that ES is not running.

Should an init (Create or Update init) ping required services?

The harness cli is broken when setting auto-start

The CLI creates a pid file in the harness directory -- bad, because when using systemctl on startup the user is root.

The harness-start CLI also checks whether a pid file exists and requires a flag to start -- bad. We should not use pid files at all, but instead see if Harness is running using ps, or just fail to start if something is bound to the port we need, and not check pids.

The harness-start script calls harness-status, which doesn't work unless Harness is installed in the root user's path when using systemctl.

CLI cleanup

First merge the CI mods to the CLI Bash scripts,
then migrate all of the CLI into Python so it can be packaged as a single script.

Auth Server prints message over and over

The following message is printed continuously in the logs when the auth server is running; is this needed?

15:01:13.622 DEBUG cluster - Checking status of localhost:27017
15:01:13.624 DEBUG cluster - Updating cluster description to  {type=STANDALONE, servers=[{address=localhost:27017, type=STANDALONE, roundTripTime=0.8 ms, state=CONNECTED}]
15:01:23.628 DEBUG cluster - Checking status of localhost:27017
15:01:23.629 DEBUG cluster - Updating cluster description to  {type=STANDALONE, servers=[{address=localhost:27017, type=STANDALONE, roundTripTime=0.8 ms, state=CONNECTED}]
15:01:33.633 DEBUG cluster - Checking status of localhost:27017
15:01:33.634 DEBUG cluster - Updating cluster description to  {type=STANDALONE, servers=[{address=localhost:27017, type=STANDALONE, roundTripTime=0.8 ms, state=CONNECTED}]
15:01:43.637 DEBUG cluster - Checking status of localhost:27017
15:01:43.639 DEBUG cluster - Updating cluster description to  {type=STANDALONE, servers=[{address=localhost:27017, type=STANDALONE, roundTripTime=0.9 ms, state=CONNECTED}]

MongoAsyncDao - Sync DAO error after Spark Job finished

General Setup

  • System: Ubuntu 18.04.3 LTS
  • Using docker-compose as described here.

Steps

  • docker-compose exec harness-cli bash -c 'harness-cli add data/my-engine-config.json' (see Config)
  • call the REST API to generate some events (harness log and response looks fine)
  • call the REST API to train the model (Job finished, but MongoAsyncDao threw Exception)
  • call the REST API to query user/item (harness log and response looks fine)

Config

{
    "engineId": "retailrocket",
    "engineFactory": "com.actionml.engines.ur.UREngine",
    "sparkConf": {
        "master": "local",
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
        "spark.kryo.referenceTracking": "false",
        "spark.kryoserializer.buffer": "300m",
        "spark.executor.memory": "1g",
        "spark.driver.memory": "1g",
        "spark.es.index.auto.create": "true",
        "spark.es.nodes": "<my_ip_adress>:9200",
        "spark.es.nodes.wan.only": "true"
    },
    "algorithm":{
        "indicators": [
            {
                "name": "transaction"
            },{
                "name": "view"
            },{
                "name": "addtocart"
            }
        ]
    }
}

Log Output

10:03:40.561 INFO  ContextCleaner    - Cleaned accumulator 5556
10:03:40.561 INFO  ContextCleaner    - Cleaned accumulator 5452
10:03:40.561 INFO  ContextCleaner    - Cleaned accumulator 5386
10:03:40.561 INFO  ContextCleaner    - Cleaned accumulator 5500
10:03:40.561 INFO  ContextCleaner    - Cleaned accumulator 5476
10:03:40.617 INFO  Executor          - Finished task 0.0 in stage 143.0 (TID 59). 1181 bytes result sent to driver
10:03:40.619 INFO  TaskSetManager    - Finished task 0.0 in stage 143.0 (TID 59) in 1243 ms on localhost (executor driver) (1/1)
10:03:40.619 INFO  TaskSchedulerImpl - Removed TaskSet 143.0, whose tasks have all completed, from pool
10:03:40.621 INFO  DAGScheduler      - ResultStage 143 (runJob at EsSpark.scala:108) finished in 1.267 s
10:03:40.625 INFO  DAGScheduler      - Job 31 finished: runJob at EsSpark.scala:108, took 1.282262 s
10:03:40.803 INFO  SparkContextSupport$ - Job f3f1f145-e4f9-44dd-a713-28587fa076d5 completed in 1573121020802 ms [engine retailrocket]
10:03:40.806 INFO  AbstractConnector - Stopped Spark@6454a6a2{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
10:03:40.813 INFO  SparkUI           - Stopped Spark web UI at http://f4f9bbcb5baa:4040
10:03:40.816 ERROR MongoAsyncDao     - Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(f3f1f145-e4f9-44dd-a713-28587fa076d5)))
10:03:40.818 ERROR MongoAsyncDao     - Sync DAO error
java.lang.RuntimeException: Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(f3f1f145-e4f9-44dd-a713-28587fa076d5)))
        at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:135)
        at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:129)
        at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
        at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
10:03:40.820 ERROR AsyncEventQueue   - Listener JobManagerListener threw an exception
java.lang.RuntimeException: Can't removeOne from collection harness_meta_store.jobs with filter WrappedArray((jobId,Equals(f3f1f145-e4f9-44dd-a713-28587fa076d5)))
        at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:135)
        at com.actionml.core.store.backends.MongoAsyncDao$$anonfun$removeOneAsync$1.apply(MongoAsyncDao.scala:129)
        at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
        at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
10:03:40.824 INFO  MapOutputTrackerMasterEndpoint - MapOutputTrackerMasterEndpoint stopped!
10:03:40.851 INFO  MemoryStore       - MemoryStore cleared
10:03:40.851 INFO  BlockManager      - BlockManager stopped
10:03:40.853 INFO  BlockManagerMaster - BlockManagerMaster stopped
10:03:40.854 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint - OutputCommitCoordinator stopped!
10:03:40.865 INFO  SparkContext      - Successfully stopped SparkContext
10:03:41.397 INFO  MongoClientCache  - Closing MongoClient: [mongo:27017]
10:03:41.400 INFO  connection        - Closed connection [connectionId{localValue:34, serverValue:34}] to mongo:27017 because the pool has been closed.

Make Harness work with load balancer

Synchronizing Harness servers means that all of them must know when one has updated the metastore. I think this means, minimally:

  • one Harness server needs to know all of the others, maybe by one being the temp master who maintains the db of all the others? Is there a simpler way, like using Etcd or Consul?
  • load balancing handled by Kubernetes or regular load balancer but sync handled by the Harness cluster
  • Harness could have a sync REST API, which tells Harness to re-sync with the metastore
  • sync sent to all harness servers when:
    • any server executes harness add, harness update, harness delete, or ???

Not sure what else should be done. Any cluster of Harness servers needs to keep track of all the others, so probably a master-voting structure like ES, where one keeps the DB of all the other servers, and if the master is not available another is voted in to take over.

This seems pretty complicated so is there an easier solution?

Are there signaling DBs that will notify listeners when something changes?
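On that last question: MongoDB itself offers change streams (3.6+, replica set required), which would let every Harness node get notified when the metastore changes. A sketch only; the harness_meta_store.engines names are taken from the log lines elsewhere in these issues.

from pymongo import MongoClient

# Watch the metastore and react to inserts/updates/deletes. Requires MongoDB
# 3.6+ running as a replica set; database/collection names from the logs above.
client = MongoClient("mongodb://localhost:27017")
engines = client["harness_meta_store"]["engines"]

with engines.watch() as stream:
    for change in stream:
        # Each Harness node could re-sync its in-memory engine list here.
        print("metastore changed:", change["operationType"], change.get("documentKey"))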

Engine Update exception

  1. harness add examples/data/ur_nav_hinting.json
  2. harness update examples/data/ur_nav_hinting.json

You get an exception in MongoDao. This blocks testing updates that change the config, since update doesn't work even when the config is not changed.

15:30:34.319 ERROR OneForOneStrategy - Can't update collection harness_meta_store.engines with filter WrappedArray((engineId,ur_nav_hinting)) and value Document((engineFactory,BsonString{value='com.actionml.engines.urnavhinting.URNavHintingEngine'}), (params,BsonString{value='{
  "engineId" : "ur_nav_hinting",
  "engineFactory" : "com.actionml.engines.urnavhinting.URNavHintingEngine",
  "mirrorContainer" : "/tmp/mirrors",
  "mirrorType" : "localfs",
  "sparkConf" : {
    "master" : "local",
    "spark.driver-memory" : "4g",
    "spark.executor-memory" : "4g",
    "spark.serializer" : "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator" : "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking" : "false",
    "spark.kryoserializer.buffer" : "300m",
    "es.index.auto.create" : "true"
  },
  "algorithm" : {
    "comment" : "may not need indexName and typeName, derive from engineId? but nowhere else to get the RESTClient address",
    "esMaster" : "es-node-1",
    "blacklist" : [
    ],
    "indicators" : [
      {
        "name" : "nav-event"
      },
      {
        "name" : "search-terms"
      },
      {
        "name" : "content-pref"
      }
    ],
    "availableDateName" : "available",
    "expireDateName" : "expires",
    "dateName" : "date",
    "num" : 6
  }
}'}))

escaped JSON in query responses

All REST responses are escaped as if they were plain strings being returned as JSON. They are actually JSON and should not be re-escaped.
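A tiny illustration of the difference (in Python only for brevity):

import json

query_result = {"result": [{"item": "item-1", "score": 0.83}]}

correct = json.dumps(query_result)                      # the JSON itself
double_escaped = json.dumps(json.dumps(query_result))   # a JSON string of escaped JSON

print(correct)         # {"result": [{"item": "item-1", "score": 0.83}]}
print(double_escaped)  # "{\"result\": [{\"item\": \"item-1\", \"score\": 0.83}]}"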

Getting a Spark Context takes too long

Running harness train <engineId> with a remote cluster may take a lot of time, many seconds, to copy jars to the context. This causes the harness command, which uses the REST API, to time out before returning the response.

So harness train times out the CLI and I believe times out the REST API too.

This should be in a Future as it was when it was first implemented.

The Future will eventually create and return an SC, which itself executes code in the context, all in the background.

The CLI and REST API to train should return immediately with the Harness JobID, then wait until the sc is ready, then execute the block of code with the sc.
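A generic illustration of that pattern (not Harness code): hand back a job id immediately and do the expensive context creation and training in the background.

import concurrent.futures
import time
import uuid

executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)
jobs = {}

def create_context_and_train(engine_id):
    time.sleep(10)                      # stands in for copying jars / building the SC
    return f"model for {engine_id}"

def start_train(engine_id):
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(create_context_and_train, engine_id)
    return job_id                       # the train call returns right away with this

job_id = start_train("my_ur")
print("queued job", job_id)
print("finished?", jobs[job_id].done())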

Include filter rule returning empty list

Hi!

Using 0.5.1.

I'm trying to use the include filter rule, but I keep getting an empty result. I used a $set event to set a category on two items that exist in my data as such:

{
   "event" : "$set",
   "entityType" : "item",
   "entityId" : "exampleItem",
   "properties" : {
      "category": ["electronics", "mobile"],
      "expireDate": "2020-10-05T21:02:49.228Z"
   },
   "eventTime" : "2019-12-17T21:02:49.228Z"
}

Then I ran the train command and then this query:

{
   "rules": [
    {
      "name": "category",
      "values": ["electronics"],
      "bias": -1
    }]
}

but I got an empty list as a result.
The other rules (bias = 0 and > 0) seem to be working as specified.
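One debugging step that may help, sketched below: check whether the category property actually made it into the Elasticsearch model. The index name is an assumption (it is often derived from the engine id), so adjust host and index to your setup.

import json
import urllib.request

# Query Elasticsearch directly for items carrying the category used in the rule.
url = "http://localhost:9200/my_engine_id/_search?q=category:electronics"
with urllib.request.urlopen(url) as resp:
    hits = json.loads(resp.read())["hits"]

print("matching items in the model:", hits["total"])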

Adding the same Engine-id

harness add <engine-config-json> should give an error if the engine-id is already in use and suggest using harness update <engine-config-json>

SSL Support for MongoDB connection

I tried to use Azure Cosmos DB as an alternative to MongoDB for Harness. Cosmos DB uses SSL and authentication by default, which is not supported in Harness yet.

Error Message:
No SSL support in java.nio.channels.AsynchronousSocketChannel. For SSL support use com.mongodb.connection.netty.NettyStreamFactoryFactory
See attached log file
harness-mongodb-ssl-error.txt

Please change the mongodb driver to support SSL connections.

All item scores are 0.0

Hello,

I've used docker-compose to set up Harness + UR, went through the workflow, and I get recommendations, but all items have 0.0 scores in the result. What could be going wrong?

The steps I did:

  • Docker-compose setup from https://github.com/actionml/harness-docker-compose/blob/develop/docker-compose.yml
  • Create a new engine with the following template:

    {
        "engineId": "movie_fit",
        "engineFactory": "com.actionml.engines.ur.UREngine",
        "sparkConf": {
            "es.index.auto.create": "true",
            "es.nodes": "elasticsearch",
            "es.nodes.wan.only": "true",
            "master": "local",
            "spark.driver.memory": "512m",
            "spark.es.index.auto.create": "true",
            "spark.es.nodes": "elasticsearch",
            "spark.es.nodes.wan.only": "true",
            "spark.executor.memory": "512m",
            "spark.kryo.referenceTracking": "false",
            "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
            "spark.kryoserializer.buffer": "300m",
            "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
        },
        "returnSelf": "false",
        "algorithm": {
            "indicators": [
                { "name": "like" },
                { "name": "wish" },
                { "name": "dislike" },
                { "name": "view" },
                { "name": "like_genre" },
                { "name": "like_person" },
                { "name": "like_year" },
                { "name": "like_content_rating" }
            ]
        }
    }
  • Send events to the engine. My app is a movie app, so I generated the user+like and user+dislike events and successfully sent them to the engine. Event data example:

    {
        event: "like",
        entityType: "user",
        entityId: "123",
        targetEntityType: "item",
        targetEntityId: "323",
        eventTime: date_time.iso8601
    }
  • Successfully trained the model via a REST API call: POST "<harness_host>/jobs"
  • And the result is recommendations with 0.0 scores (POST "<harness_host>/queries"). This happens regardless of whether I request recommendations for a user or just popular items.

Please assist
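For reference, here is a sketch of a direct user-based query to compare against; the /engines/<id>/queries path mirrors the events path seen in the logs earlier in this list, and the exact query body is an assumption.

import requests

resp = requests.post(
    "http://localhost:9090/engines/movie_fit/queries",  # assumed path
    json={"user": "123", "num": 10},                     # assumed query body
)
print(resp.json())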

`harness stop` does not always work

harness stop, which uses cat ${PIDFILE} | xargs kill, does not always work. I have waited a long time and eventually must kill -9 ....

For now we can tell if the stop does not succeed by checking with jps and printing a warning. But the real fix is unknown. This seems to have something to do with the Contextual Bandit since it is that integration test that has trouble. This in turn may mean it is a VW integration problem, not sure.

spark running error : ERROR Inbox: Ignoring error java.io.EOFException

Spark Executor Command: "/usr/lib/jvm/java-1.8-openjdk/jre/bin/java" "-cp" "///conf:/spark/jars/*" "-Xmx2048M" "-Dspark.driver.port=44603" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@ff059f403060:44603" "--executor-id" "1" "--hostname" "172.29.0.5" "--cores" "2" "--app-id" "app-20190625031854-0000" "--worker-url" "spark://[email protected]:37435"

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/06/25 03:18:59 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 57@aae27ca361b0
19/06/25 03:18:59 INFO SignalUtils: Registered signal handler for TERM
19/06/25 03:18:59 INFO SignalUtils: Registered signal handler for HUP
19/06/25 03:18:59 INFO SignalUtils: Registered signal handler for INT
19/06/25 03:19:02 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/06/25 03:19:03 INFO SecurityManager: Changing view acls to: root
19/06/25 03:19:03 INFO SecurityManager: Changing modify acls to: root
19/06/25 03:19:03 INFO SecurityManager: Changing view acls groups to:
19/06/25 03:19:03 INFO SecurityManager: Changing modify acls groups to:
19/06/25 03:19:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/06/25 03:19:05 INFO TransportClientFactory: Successfully created connection to ff059f403060/172.29.0.2:44603 after 408 ms (0 ms spent in bootstraps)
19/06/25 03:19:06 INFO SecurityManager: Changing view acls to: root
19/06/25 03:19:06 INFO SecurityManager: Changing modify acls to: root
19/06/25 03:19:06 INFO SecurityManager: Changing view acls groups to:
19/06/25 03:19:06 INFO SecurityManager: Changing modify acls groups to:
19/06/25 03:19:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/06/25 03:19:06 INFO TransportClientFactory: Successfully created connection to ff059f403060/172.29.0.2:44603 after 5 ms (0 ms spent in bootstraps)
19/06/25 03:19:06 INFO DiskBlockManager: Created local directory at /tmp/spark-21895064-184b-4817-9988-c7e05776cbe8/executor-4423d906-96fc-4abc-a1ec-896b4b944506/blockmgr-58f7f70a-975f-465c-8dd5-a3473462f140
19/06/25 03:19:06 INFO MemoryStore: MemoryStore started with capacity 912.3 MB
19/06/25 03:19:07 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@ff059f403060:44603
19/06/25 03:19:07 INFO WorkerWatcher: Connecting to worker spark://[email protected]:37435
19/06/25 03:19:07 INFO TransportClientFactory: Successfully created connection to /172.29.0.5:37435 after 11 ms (0 ms spent in bootstraps)
19/06/25 03:19:07 INFO WorkerWatcher: Successfully connected to spark://[email protected]:37435
19/06/25 03:19:07 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
19/06/25 03:19:07 INFO Executor: Starting executor ID 1 on host 172.29.0.5
19/06/25 03:19:08 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40379.
19/06/25 03:19:08 INFO NettyBlockTransferService: Server created on 172.29.0.5:40379
19/06/25 03:19:08 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
19/06/25 03:19:08 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(1, 172.29.0.5, 40379, None)
19/06/25 03:19:08 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(1, 172.29.0.5, 40379, None)
19/06/25 03:19:08 INFO BlockManager: Initialized BlockManager: BlockManagerId(1, 172.29.0.5, 40379, None)
19/06/25 03:19:08 ERROR Inbox: Ignoring error
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.spark.scheduler.TaskDescription$$anonfun$decode$1.apply(TaskDescription.scala:134)
at org.apache.spark.scheduler.TaskDescription$$anonfun$decode$1.apply(TaskDescription.scala:133)
at scala.collection.immutable.Range.foreach(Range.scala:160)
at org.apache.spark.scheduler.TaskDescription$.decode(TaskDescription.scala:133)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:96)
at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:117)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Train job stuck executing

I've followed the quick start guide: https://actionml.com/docs/h_ur_quickstart

My config.json is as follows:

{
    "engineId": "2",
    "engineFactory": "com.actionml.engines.ur.UREngine",
    "sparkConf": {
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
        "spark.kryo.referenceTracking": "false",
        "spark.kryoserializer.buffer": "300m",
        "spark.executor.memory": "3g",
        "spark.driver.memory": "3g",
        "spark.es.index.auto.create": "true",
        "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
        "spark.es.nodes.wan.only": "true"
    },
    "algorithm":{
        "indicators": [
            {
                "name": "buy"
            },{
                "name": "view"
            }
        ]
    }
}

I run harness-cli train 2 and when I check harness-cli status engines 2 I see:

/harness-cli/harness-cli/harness-status: line 10: /harness-cli/harness-cli/RELEASE: No such file or directory
Harness CLI v settings
==================================================================
HARNESS_CLI_HOME ........................ /harness-cli/harness-cli
HARNESS_CLI_SSL_ENABLED .................................... false
HARNESS_CLI_AUTH_ENABLED ................................... false
HARNESS_SERVER_ADDRESS ................................... harness
HARNESS_SERVER_PORT ......................................... 9090
==================================================================
Harness Server status: OK
Status for engine-id: 2
{
    "engineParams": {
        "algorithm": {
            "indicators": [
                {
                    "name": "buy"
                },
                {
                    "name": "view"
                }
            ]
        },
        "engineFactory": "com.actionml.engines.ur.UREngine",
        "engineId": "2",
        "sparkConf": {
            "spark.driver.memory": "3g",
            "spark.es.index.auto.create": "true",
            "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
            "spark.es.nodes.wan.only": "true",
            "spark.executor.memory": "3g",
            "spark.kryo.referenceTracking": "false",
            "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
            "spark.kryoserializer.buffer": "300m",
            "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
        }
    },
    "jobStatuses": {
        "ed0becbf-ce7d-4e62-822d-2e3f2138e235": {
            "comment": "Spark job",
            "jobId": "ed0becbf-ce7d-4e62-822d-2e3f2138e235",
            "status": {
                "name": "executing"
            }
        }
    }
}

The job never moves from executing; is there any way I can debug why this is happening?

I've set harness up by following https://actionml.com/docs/harness_container_guide
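One way to watch the job rather than re-running the status command by hand: a small polling sketch around the CLI command shown above. Beyond that, the Harness container log (for example docker-compose logs -f harness, using the service name from the compose file) is usually where a failed SparkContext shows up, as in the traces earlier in this list.

import subprocess
import time

# Poll `harness-cli status engines 2` until the job status is no longer
# "executing"; capture the output along the way for a bug report.
while True:
    out = subprocess.run(
        ["harness-cli", "status", "engines", "2"],
        capture_output=True, text=True,
    ).stdout
    print(out)
    if '"executing"' not in out:
        break
    time.sleep(30)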
