The Apache OpenWhisk Kubernetes Deployment repository supports deploying the Apache OpenWhisk system on Kubernetes and OpenShift clusters.

Home Page: https://openwhisk.apache.org/

License: Apache License 2.0

openwhisk-deploy-kube's Introduction

OpenWhisk Deployment on Kubernetes

Apache OpenWhisk is an open source, distributed Serverless platform that executes functions (fx) in response to events at any scale. The OpenWhisk platform supports a programming model in which developers write functional logic (called Actions), in any supported programming language, that can be dynamically scheduled and run in response to associated events (via Triggers) from external sources (Feeds) or from HTTP requests.

This repository supports deploying OpenWhisk to Kubernetes and OpenShift. It contains a Helm chart that can be used to deploy the core OpenWhisk platform and optionally some of its Event Providers to both single-node and multi-node Kubernetes and OpenShift clusters.

Table of Contents

Prerequisites: Kubernetes and Helm

Kubernetes is a container orchestration platform that automates the deployment, scaling, and management of containerized applications. Helm is a package manager for Kubernetes that simplifies the management of Kubernetes applications. You do not need to have detailed knowledge of either Kubernetes or Helm to use this project, but you may find it useful to review their basic documentation to become familiar with their key concepts and terminology.

Kubernetes

Your first step is to create a Kubernetes cluster that is capable of supporting an OpenWhisk deployment. Although there are some technical requirements that the Kubernetes cluster must satisfy, any of the options described below is acceptable.

Simple Docker-based options

The simplest way to get a small Kubernetes cluster suitable for development and testing is to use one of the Docker-in-Docker approaches for running Kubernetes directly on top of Docker on your development machine. Configuring Docker with 4GB of memory and 2 virtual CPUs is sufficient for the default settings of OpenWhisk. Depending on your host operating system, we recommend the following:

  1. MacOS: Use the built-in Kubernetes support in Docker for Mac version 18.06 or later. Please follow our setup instructions to initially create your cluster.
  2. Linux: Use kind. Please follow our setup instructions to initially create your cluster.
  3. Windows: Use the built-in Kubernetes support in Docker for Windows version 18.06 or later. Please follow our setup instructions to initially create your cluster.
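For the kind option above, a small development cluster is typically described by a config file. The sketch below is illustrative only; the linked setup instructions are authoritative and may require additional settings (for example, port mappings for the ingress).

# kind-cluster.yaml (illustrative sketch)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker

kind create cluster --config kind-cluster.yaml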

Using a Kubernetes cluster from a cloud provider

You can also provision a Kubernetes cluster from a cloud provider, subject to the cluster meeting the technical requirements. You will need at least 1 worker node with 4GB of memory and 2 virtual CPUs to deploy the default configuration of OpenWhisk. You can deploy to significantly larger clusters by scaling up the replica count of the various components and labeling multiple nodes as invoker nodes. We have detailed documentation on using Kubernetes clusters from the following major cloud providers:

We would welcome contributions of documentation for Azure (AKS) and any other public cloud providers.

Using OpenShift

You will need at least 1 worker node with 4GB of memory and 2 virtual CPUs to deploy the default configuration of OpenWhisk. You can deploy to significantly larger clusters by scaling up the replica count of the various components and labeling multiple nodes as invoker nodes. For more detailed documentation, see:

Using a Kubernetes cluster you built yourself

If you are comfortable with building your own Kubernetes clusters and deploying services with ingresses to them, you should also be able to deploy OpenWhisk to a do-it-yourself cluster. Make sure your cluster meets the technical requirements. You will need at least 1 worker node with 4GB of memory and 2 virtual CPUs to deploy the default configuration of OpenWhisk. You can deploy to significantly larger clusters by scaling up the replica count of the various components and labeling multiple nodes as invoker nodes.

Additional, more detailed instructions:

Helm

Helm is a tool to simplify the deployment and management of applications on Kubernetes clusters. The OpenWhisk Helm chart requires Helm 3.

Our automated testing currently uses Helm v3.2.4.

Follow the Helm install instructions for your platform to install Helm v3.0.1 or newer.
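For example, on Linux or MacOS one common way to install a recent Helm 3 release is the official installer script; check the Helm documentation for the currently recommended method and URL before running it.

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
helm version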

Deploying OpenWhisk

Now that you have your Kubernetes cluster and have installed the Helm 3 CLI, you are ready to deploy OpenWhisk.

Overview

You will use Helm to deploy OpenWhisk to your Kubernetes cluster. There are four deployment steps, described in more detail in the rest of this section.

  1. Initial cluster setup. If you have provisioned a multi-node cluster, you should label the worker nodes to indicate their intended usage by OpenWhisk.
  2. Customize the deployment. You will create a mycluster.yaml that specifies key facts about your Kubernetes cluster and the OpenWhisk configuration you wish to deploy. Predefined mycluster.yaml files for common flavors of Kubernetes clusters are provided in the deploy directory.
  3. Deploy OpenWhisk with Helm. You will use Helm and mycluster.yaml to deploy OpenWhisk to your Kubernetes cluster.
  4. Configure the wsk CLI. You need to tell the wsk CLI how to connect to your OpenWhisk deployment.

Initial setup

Single Worker Node Clusters

If your cluster has a single worker node, then you should configure OpenWhisk without node affinity. This is done by adding the following lines to your mycluster.yaml:

affinity:
  enabled: false

toleration:
  enabled: false

invoker:
  options: "-Dwhisk.kubernetes.user-pod-node-affinity.enabled=false"

Multi Worker Node Clusters

If you are deploying OpenWhisk to a cluster with multiple worker nodes, we recommend using node affinity to segregate the compute nodes used for the OpenWhisk control plane from those used to execute user functions. Do this by labeling each node with openwhisk-role=invoker. In the default configuration, which uses the KubernetesContainerFactory, the node labels are used in conjunction with Pod affinities to inform the Kubernetes scheduler how to place work so that user actions will not interfere with the OpenWhisk control plane. When using the non-default DockerContainerFactory, OpenWhisk assumes it has exclusive use of these invoker nodes and will schedule work on them directly, completely bypassing the Kubernetes scheduler. For each node <INVOKER_NODE_NAME> you want to be an invoker, execute

kubectl label node <INVOKER_NODE_NAME> openwhisk-role=invoker

If you are targeting OpenShift, use the command

oc label node <INVOKER_NODE_NAME> openwhisk-role=invoker

For more precise control of the placement of the rest of OpenWhisk's pods on a multi-node cluster, you can optionally label additional non-invoker worker nodes. Use the label openwhisk-role=core to indicate nodes which should run the OpenWhisk control plane (the controller, kafka, zookeeper, and couchdb pods). If you have dedicated Ingress nodes, label them with openwhisk-role=edge. Finally, if you want to run the OpenWhisk Event Providers on specific nodes, label those nodes with openwhisk-role=provider.
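For example:

kubectl label node <CORE_NODE_NAME> openwhisk-role=core
kubectl label node <EDGE_NODE_NAME> openwhisk-role=edge
kubectl label node <PROVIDER_NODE_NAME> openwhisk-role=provider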

If the Kubernetes cluster does not allow you to assign a label to a node, or you cannot use the affinity attribute, you can use the yaml snippet shown above in the single worker node configuration to disable the use of affinities by OpenWhisk.

Customize the Deployment

You will need a mycluster.yaml file to record key aspects of your Kubernetes cluster that are needed to configure the deployment of OpenWhisk to your cluster. For details, see the documentation appropriate to your Kubernetes cluster:

Default/template mycluster.yaml files for various types of Kubernetes clusters can be found in subdirectories of deploy.

Beyond the basic Kubernetes cluster specific configuration information, the mycluster.yaml file can also be used to customize your OpenWhisk deployment by enabling optional features and controlling the replication factor of the various microservices that make up the OpenWhisk implementation. See the configuration choices documentation for a discussion of the primary options.
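As a rough illustration only, a minimal mycluster.yaml for a NodePort-based ingress often looks like the sketch below. The exact key names (including whether type is a valid key) and values are defined by the chart's values.yaml and the documentation for your cluster type, so treat this as a starting shape rather than a working configuration.

whisk:
  ingress:
    type: NodePort
    apiHostName: <address of a worker node>
    apiHostPort: 31001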

Deploy With Helm

For simplicity, in this README, we have used owdev as the release name and openwhisk as the namespace into which the Chart's resources will be deployed. You can use a different name and/or namespace simply by changing the commands used below.

NOTE: The commands below assume Helm v3.2.0 or higher. Verify your local Helm version with the command helm version.

Deploying Released Charts from Helm Repository

The OpenWhisk project maintains a Helm repository at https://openwhisk.apache.org/charts. You may install officially released versions of OpenWhisk from this repository:

helm repo add openwhisk https://openwhisk.apache.org/charts
helm repo update
helm install owdev openwhisk/openwhisk -n openwhisk --create-namespace -f mycluster.yaml

Deploying from Git

To deploy directly from sources, either download the latest source release or git clone https://github.com/apache/openwhisk-deploy-kube.git and use the Helm chart from the helm/openwhisk folder of the source tree.

helm install owdev ./helm/openwhisk -n openwhisk --create-namespace -f mycluster.yaml

Checking status

You can use the command helm status owdev -n openwhisk to get a summary of the various Kubernetes artifacts that make up your OpenWhisk deployment. Once the pod whose name contains install-packages has reached the Completed state, your OpenWhisk deployment is ready to be used.

NOTE: You can check the status of the pods by running the following command: kubectl get pods -n openwhisk --watch.
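If you prefer to block until the packages installer finishes, something like the following can be used. This assumes the installer runs as a Kubernetes Job named owdev-install-packages; check kubectl get jobs -n openwhisk for the actual name in your deployment.

kubectl -n openwhisk wait --for=condition=complete job/owdev-install-packages --timeout=600s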

Configure the wsk CLI

Configure the OpenWhisk CLI, wsk, by setting the auth and apihost properties (if you don't already have the wsk cli, follow the instructions here to get it). Replace whisk.ingress.apiHostName and whisk.ingress.apiHostPort with the actual values from your mycluster.yaml.

wsk property set --apihost <whisk.ingress.apiHostName>:<whisk.ingress.apiHostPort>
wsk property set --auth 23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP

Configuring the CLI for Kubernetes on Docker for Mac and Windows

The docker0 network interface does not exist in the Docker for Mac/Windows host environment. Instead, exposed NodePorts are forwarded from localhost to the appropriate containers. This means that you will use localhost instead of whisk.ingress.apiHostName when configuring the wsk cli, and replace whisk.ingress.apiHostPort with the actual value from your mycluster.yaml.

wsk property set --apihost localhost:<whisk.ingress.apiHostPort>
wsk property set --auth 23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP

Verify your OpenWhisk Deployment

Your OpenWhisk installation should now be usable. You can test it by following these instructions to define and invoke a sample OpenWhisk action in your favorite programming language.

You can also issue the command helm test owdev -n openwhisk to run the basic verification test suite included in the OpenWhisk Helm chart.

Note: if you installed self-signed certificates, which is the default for the OpenWhisk Helm chart, you will need to use wsk -i to suppress certificate checking. This works around cannot validate certificate errors from the wsk CLI.

If your deployment is not working, check our troubleshooting guide for ideas.

Scale-up your OpenWhisk Deployment

With the default settings, your deployment provides a bare-minimum working platform for testing and exploration. For your specialized workloads, you can scale up your OpenWhisk deployment by defining your deployment configuration in your mycluster.yaml, which overrides the defaults in helm/openwhisk/values.yaml. Some important parameters to consider (for other parameters, check helm/openwhisk/values.yaml and configurationChoices):

  • actionsInvokesPerminute: limits the maximum number of invocations per minute.
  • actionsInvokesConcurrent: limits the maximum concurrent invocations.
  • containerPool: total memory available per invoker instance. The invoker uses this memory to create containers for user actions. The concurrency limit (actions running in parallel) depends on the total memory configured for containerPool and the memory allocated per action (default: 256MB per container).

For more information about increasing concurrency-limit, check scaling-up your deployment.
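As a sketch only, overriding these values in mycluster.yaml typically looks like the snippet below; the authoritative key paths (here assumed to be whisk.limits and invoker.containerPool.userMemory) are defined in helm/openwhisk/values.yaml, so verify them there before deploying.

whisk:
  limits:
    actionsInvokesPerminute: 600
    actionsInvokesConcurrent: 100

invoker:
  containerPool:
    userMemory: "4096m"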

Administering OpenWhisk

Wskadmin is the tool to perform various administrative operations against an OpenWhisk deployment.

Since wskadmin requires credentials for direct access to the database (which is not normally accessible from outside the cluster), it is deployed in a pod inside Kubernetes that is configured with the proper parameters. You can run wskadmin with kubectl. You need to use the <namespace> and the release <name> that you chose when deploying the chart.

You can then invoke wskadmin with:

kubectl -n <namespace> -ti exec <name>-wskadmin -- wskadmin <parameters>

For example, if your deployment name is owdev and the namespace is openwhisk, you can list users in the guest namespace with:

$ kubectl -n openwhisk  -ti exec owdev-wskadmin -- wskadmin user list guest
23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP
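Other wskadmin subcommands are invoked the same way. For example, creating a new subject might look like the following; see the wskadmin documentation in the core OpenWhisk repository for the commands and arguments that are actually available.

kubectl -n openwhisk -ti exec owdev-wskadmin -- wskadmin user create another-user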

Check here for details about the available commands.

Development and Testing OpenWhisk on Kubernetes

This section outlines how common OpenWhisk development tasks are supported when OpenWhisk is deployed on Kubernetes using Helm.

Running OpenWhisk test cases

Some key differences in a Kubernetes-based deployment of OpenWhisk are that deploying the system does not generate a whisk.properties file and that the various internal microservices (invoker, controller, etc.) are not directly accessible from the outside of the Kubernetes cluster. Therefore, although you can run full system tests against a Kubernetes-based deployment by giving some extra command line arguments, any unit tests that assume direct access to one of the internal microservices will fail. First clone the core OpenWhisk repository locally and set $OPENWHISK_HOME to its top-level directory. Then, the system tests can be executed in a batch-style as shown below, where WHISK_SERVER and WHISK_AUTH are replaced by the values returned by wsk property get --apihost and wsk property get --auth respectively.

cd $OPENWHISK_HOME
./gradlew :tests:testSystemKCF -Dwhisk.auth=$WHISK_AUTH -Dwhisk.server=https://$WHISK_SERVER -Dopenwhisk.home=`pwd`
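If you want to capture those two values into environment variables first, a small sketch using standard shell tools is shown below; the exact output format of wsk property get can vary between CLI versions, so adjust the parsing as needed.

export WHISK_SERVER=$(wsk property get --apihost | awk '{print $NF}')
export WHISK_AUTH=$(wsk property get --auth | awk '{print $NF}')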

You can also launch the system tests as JUnit test from an IDE by adding the same system properties to the JVM command line used to launch the tests:

 -Dwhisk.auth=$WHISK_AUTH -Dwhisk.server=https://$WHISK_SERVER -Dopenwhisk.home=`pwd`

NOTE: You need to install JDK 8 in order to run these tests.

Deploying a locally built docker image

If you are using Kubernetes in Docker, it is straightforward to deploy local images by adding a stanza to your mycluster.yaml. For example, to use a locally built controller image, just add the stanza below to your mycluster.yaml to override the default behavior of pulling a stable openwhisk/controller image from Docker Hub.

controller:
  imageName: "whisk/controller"
  imageTag: "latest"

Selectively redeploying using a locally built docker image

You can use the helm upgrade command to selectively redeploy one or more OpenWhisk components. Continuing the example above, if you make additional changes to the controller source code and want to just redeploy it without redeploying the entire OpenWhisk system you can do the following:

If you are using a multi-node Kubernetes cluster you will need to repeat the following steps on all nodes that may run the controller component.

The first step is to rebuild the docker image:

# Execute this command in your openwhisk directory
bin/wskdev controller -b

Note that the wskdev flags -x and -d are not compatible with the Kubernetes deployment of OpenWhisk.

Alternatively, you can build all of the OpenWhisk docker components:

# Execute this command in your openwhisk directory
./gradlew distDocker

After building the new docker image(s), tag the new image:

# Tag the docker image you seek to redeploy
docker tag whisk/controller whisk/controller:v2

Then, edit your mycluster.yaml to contain:

controller:
  imageName: "whisk/controller"
  imageTag: "v2"

Redeploy with Helm by executing this command in your openwhisk-deploy-kube directory:

helm upgrade owdev ./helm/openwhisk -n openwhisk -f mycluster.yaml

Deploying a Lean OpenWhisk version

To have a lean setup (no Kafka, no Zookeeper, and no Invokers as separate entities), add the following to your mycluster.yaml:

controller:
  lean: true

Cleanup

Use the following command to remove all the deployed OpenWhisk components:

helm uninstall owdev -n openwhisk

By default, helm uninstall removes the history of previous deployments. If you want to keep the history, add the command line flag --keep-history.

Issues

If your OpenWhisk deployment is not working, check our troubleshooting guide for ideas.

Report bugs, ask questions and request features here on GitHub.

You can also join our slack channel and chat with developers. To get access to our slack channel, request an invite here.


openwhisk-deploy-kube's Issues

issue about couchdb

When I run "ansible-playbook -i environments/kube couchdb.yml", the error is:

TASK [couchdb : wait until the CouchDB in this host is up and running] ************************************************************************
Wednesday 21 June 2017 01:28:22 +0000 (0:00:00.681) 0:00:01.859 ********
fatal: [127.0.0.1]: FAILED! => {"changed": false, "elapsed": 60, "failed": true, "msg": "Timeout when waiting for couchdb.openwhisk:5984"}

PLAY RECAP ************************************************************************************************************************************
127.0.0.1 : ok=2 changed=1 unreachable=0 failed=1

Failed to create api for a java web action

Here are the key steps I used:

$ ./wsk action invoke --result hello --param name Terry --insecure
{
"greeting": "Hello Terry!"
}

$ ./wsk action update hello --web true --insecure
ok: updated action hello

$ ./wsk api create /api-demo/v1 /hello get hello --insecure --response-type json
error: Unable to create API: The requested resource does not exist. (code 22460)

CouchDB should not need Ansible for configuration

Ideally we need to be able to remove the Ansible configuration that is used to set up CouchDB.

  • Generate a Docker image with an already initialized database.
  • The Docker image should have an init script that edits the authentication for unique credentials. Also edit the entries within the database with unique credentials

controller not starting, configuration error

Going through the installation process, I see that the controller pods are not starting properly, complaining about a misconfiguration.

[2017-09-26T16:36:36.971Z] [ERROR] [??] [Config] required property controller.blackboxFraction still not set 
[2017-09-26T16:36:36.971Z] [ERROR] [??] [Controller] Bad configuration, cannot start. 

It also impacts nginx, which appears to rely on the controller service.

All the configurations are the ones coming from the GitHub repo, nothing changed so far.

Deploying in Kubernetes 1.7.2 in an actual cluster deployed with Rancher (not Minikube nor local).

Openwhisk fails after cluster reboot due to no persistence

I'm running a Kubernetes cluster in Vagrant for testing and noticed that on reboot, openwhisk fails because the couchdb and consul data is not persisted.

I've already made the required changes in my fork and it's all working.
PR incoming.

invoker-0 stuck in CrashLoopBackOff loop probably due to issues creating a kafka topic

I have started all the openwhisk services, but the invoker appears to have issues creating a topic in kafka and then triggers a JVM shutdown hook. The relevant errors from the invoker logs appear to be the following, which originate from invoker.scala

[2017-12-01T15:01:42.512Z] [ERROR] [??] [KafkaMessagingProvider] exception during creation of topic invoker0
[2017-12-01T15:01:42.514Z] [ERROR] [#sid_100] [Invoker] failure during msgProvider.ensureTopic for topic invoker0
[INFO] [12/01/2017 15:01:42.604] [kamon-shutdown-hook-1] [CoordinatedShutdown(akka://kamon)] Starting coordinated shutdown from JVM shutdown hook

I have also included the relevant pods output as well as the full logs from invoker and relevant logs from kafka in case I missed some output that might help debug things.

-> % kubectl get pods -n openwhisk                   
NAME                         READY     STATUS             RESTARTS   AGE
apigateway-75f6c5fdc-jrp8n   1/1       Running            0          15h
controller-0                 0/1       CrashLoopBackOff   75         15h
controller-1                 0/1       CrashLoopBackOff   75         15h
couchdb-5b6c59c7cc-hgf2z     1/1       Running            0          15h
invoker-0                    0/1       CrashLoopBackOff   32         15h
kafka-797ff4c999-9tjb6       1/1       Running            0          15h
nginx-88b764487-9bvls        1/1       Running            0          15h
redis-7fb77bfc4-n5h8b        1/1       Running            0          15h
zookeeper-575965f666-4wgwj   1/1       Running            0          15
-> % kubectl -n openwhisk logs invoker-0
[2017-12-01T14:59:38.893Z] [INFO] Initializing Kamon...
[INFO] [12/01/2017 14:59:39.358] [main] [StatsDExtension(akka://kamon)] Starting the Kamon(StatsD) extension
[2017-12-01T14:59:39.506Z] [INFO] Slf4jLogger started
[2017-12-01T14:59:39.717Z] [INFO] [??] [Config] environment set value for db.whisk.actions
[2017-12-01T14:59:39.805Z] [INFO] [??] [Config] environment set value for db.protocol
[2017-12-01T14:59:39.817Z] [INFO] [??] [Config] environment set value for docker.image.prefix
[2017-12-01T14:59:39.839Z] [INFO] [??] [Config] environment set value for invoker.container.network
[2017-12-01T14:59:39.841Z] [INFO] [??] [Config] environment set value for whisk.api.host.name
[2017-12-01T14:59:39.843Z] [INFO] [??] [Config] environment set value for db.port
[2017-12-01T14:59:39.843Z] [INFO] [??] [Config] environment set value for db.whisk.activations.filter.ddoc
[2017-12-01T14:59:39.843Z] [INFO] [??] [Config] environment set value for db.username
[2017-12-01T14:59:39.843Z] [INFO] [??] [Config] environment set value for db.whisk.activations
[2017-12-01T14:59:39.844Z] [INFO] [??] [Config] environment set value for docker.registry
[2017-12-01T14:59:39.847Z] [INFO] [??] [Config] environment set value for db.whisk.actions.ddoc
[2017-12-01T14:59:39.872Z] [INFO] [??] [Config] environment set value for invoker.name
[2017-12-01T14:59:39.872Z] [INFO] [??] [Config] environment set value for docker.image.tag
[2017-12-01T14:59:39.873Z] [INFO] [??] [Config] environment set value for invoker.use.runc
[2017-12-01T14:59:39.873Z] [INFO] [??] [Config] environment set value for db.whisk.activations.ddoc
[2017-12-01T14:59:39.900Z] [INFO] [??] [Config] environment set value for zookeeper.hosts
[2017-12-01T14:59:39.900Z] [INFO] [??] [Config] environment set value for runtimes.manifest
[2017-12-01T14:59:39.953Z] [INFO] [??] [Config] environment set value for kafka.hosts
[2017-12-01T14:59:39.953Z] [INFO] [??] [Config] environment set value for db.host
[2017-12-01T14:59:39.953Z] [INFO] [??] [Config] environment set value for port
[2017-12-01T14:59:39.954Z] [INFO] [??] [Config] environment set value for db.provider
[2017-12-01T14:59:39.954Z] [INFO] [??] [Config] environment set value for db.password
[2017-12-01T14:59:41.038Z] [INFO] [??] [Invoker] Command line arguments parsed to yield CmdLineArgs(None,Some(0))
[2017-12-01T14:59:41.160Z] [INFO] [??] [Invoker] invokerReg: using proposedInvokerId 0
[2017-12-01T15:01:42.512Z] [ERROR] [??] [KafkaMessagingProvider] exception during creation of topic invoker0
[2017-12-01T15:01:42.514Z] [ERROR] [#sid_100] [Invoker] failure during msgProvider.ensureTopic for topic invoker0
[INFO] [12/01/2017 15:01:42.604] [kamon-shutdown-hook-1] [CoordinatedShutdown(akka://kamon)] Starting coordinated shutdown from JVM shutdown hook
kafka is up and running
+ echo 'Create health topic'
Create health topic
++ kafka-topics.sh --create --topic health --replication-factor 1 --partitions 1 --zookeeper zookeeper.openwhisk:2181 --config retention.bytes=536870912 --config retention.ms=1073741824 --config segment.bytes=3600000
[2017-11-30 23:08:26,065] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2017-11-30 23:08:26,113] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2017-11-30 23:08:26,114] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2017-11-30 23:08:26,498] INFO [ExpirationReaper-0], Starting  (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2017-11-30 23:08:26,504] INFO [ExpirationReaper-0], Starting  (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2017-11-30 23:08:26,508] INFO [ExpirationReaper-0], Starting  (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2017-11-30 23:08:26,571] INFO [GroupCoordinator 0]: Starting up. (kafka.coordinator.GroupCoordinator)
[2017-11-30 23:08:26,579] INFO [GroupCoordinator 0]: Startup complete. (kafka.coordinator.GroupCoordinator)
[2017-11-30 23:08:26,592] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 13 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-11-30 23:08:26,690] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
[2017-11-30 23:08:26,814] INFO Creating /brokers/ids/0 (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2017-11-30 23:08:26,830] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-11-30 23:08:26,832] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2017-11-30 23:08:26,834] INFO Registered broker 0 at path /brokers/ids/0 with addresses: EndPoint(kafka.openwhisk,9092,ListenerName(PLAINTEXT),PLAINTEXT) (kafka.utils.ZkUtils)
[2017-11-30 23:08:26,835] WARN No meta.properties file under dir /data/meta.properties (kafka.server.BrokerMetadataCheckpoint)
[2017-11-30 23:08:26,927] INFO Kafka version : 0.10.2.1 (org.apache.kafka.common.utils.AppInfoParser)
[2017-11-30 23:08:26,927] INFO Kafka commitId : e89bffd6b2eff799 (org.apache.kafka.common.utils.AppInfoParser)
[2017-11-30 23:08:26,931] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
+ OUTPUT='Created topic "health".'
+ [[ Created topic "health". == *\a\l\r\e\a\d\y\ \e\x\i\s\t\s* ]]
+ [[ Created topic "health". == *\C\r\e\a\t\e\d\ \t\o\p\i\c* ]]
+ fg
/start.sh
[2017-11-30 23:18:26,585] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 3 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-11-30 23:28:26,581] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-11-30 23:38:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-11-30 23:48:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-11-30 23:58:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 00:08:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 00:18:26,580] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 00:28:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 00:38:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 00:48:26,579] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 1 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 00:58:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 01:08:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 01:18:26,579] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 01:28:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 01:38:26,577] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 01:48:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 01:58:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 02:08:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 02:18:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-12-01 02:28:26,578] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)

Invoker is not deployed

Hi,
I want to deploy OpenWhisk on Kubernetes (v1.5.6) by using this code.
I found that when I used the default setting (danlavine's whisk_config:v1.5.6 image) and executed "kubectl apply -f configure/configure_whisk.yml", the invoker was not installed & configured.

I checked the job and found that "kubectl apply -f environments/kube/files/invoker-service.yml" was never called. I suspect that installing & configuring the invoker was not executed.

[Log (snippet)]

$kubectl -n openwhisk logs -f configure-openwhisk-n639f
+ cp -R /openwhisk/ansible/ /incubator-openwhisk-deploy-kube/ansible
+ cp -R /openwhisk/tools/ /incubator-openwhisk-deploy-kube/tools
+ cp -R /openwhisk/bin/ /incubator-openwhisk-deploy-kube/bin
+ mkdir -p /incubator-openwhisk-deploy-kube/core
+ cp -R /openwhisk/core/routemgmt /incubator-openwhisk-deploy-kube/core/routemgmt
+ cp -R /incubator-openwhisk-deploy-kube/ansible-kube/. /incubator-openwhisk-deploy-kube/ansible/
+ pushd /incubator-openwhisk-deploy-kube/ansible
/incubator-openwhisk-deploy-kube/ansible /
+ ansible-playbook -i environments/kube setup.yml
+ kubectl proxy -p 8001
Starting to serve on 127.0.0.1:8001
PLAY [localhost] ***************************************************************
:
Tuesday 11 July 2017  11:15:53 +0000 (0:00:01.119)       0:00:11.520 ********** 
=============================================================================== 
Gathering Facts --------------------------------------------------------- 9.16s
gen untrusted certificate for host -------------------------------------- 1.12s
prepare db_local.ini ---------------------------------------------------- 0.62s
check if db_local.ini exists? ------------------------------------------- 0.39s
find the ip of docker-machine ------------------------------------------- 0.05s
get the docker-machine ip ----------------------------------------------- 0.05s
gen hosts for docker-machine -------------------------------------------- 0.05s
add new db host on-the-fly ---------------------------------------------- 0.05s
+ kubectl apply -f environments/kube/files/db-service.yml
service "couchdb" created
+ kubectl apply -f environments/kube/files/consul-service.yml
service "consul" created
+ kubectl apply -f environments/kube/files/zookeeper-service.yml
service "zookeeper" created
+ kubectl apply -f environments/kube/files/kafka-service.yml
service "kafka" created
+ kubectl apply -f environments/kube/files/controller-service.yml
service "controller" created
+ deployCouchDB
++ kubectl -n openwhisk get pods --show-all
++ grep couchdb
++ grep 1/1
+ COUCH_DEPLOYED=
+ '[' -z '' ']'
+ return 0
+ ansible-playbook -i environments/kube couchdb.yml
:

[Result]

$kubectl -n openwhisk get pods --show-all=true
NAME                          READY     STATUS      RESTARTS   AGE
configure-openwhisk-n639f     0/1       Completed   0          7m
consul-57995027-gtkfr         2/2       Running     0          5m
controller-3250411552-jxnrd   1/1       Running     0          4m
couchdb-109298327-0h6c6       1/1       Running     0          6m
kafka-1060962555-62sx5        1/1       Running     0          4m
zookeeper-1304892743-w5vwj    1/1       Running     0          4m

On the other hand, when I built my own customized image and ran "kubectl apply -f environments/kube/files/invoker-service.yml", everything finished successfully and the invoker was created in the end.

[Log (snippet)]

$kubectl -n openwhisk logs -f configure-openwhisk-s520q
++ awk '{print $2}'
++ grep replicas:
++ cat /incubator-openwhisk-deploy-kube/ansible-kube/environments/kube/files/invoker.yml
+ INVOKER_REP_COUNT=1
+ INVOKER_COUNT=1
+ sed -ie s/REPLACE_INVOKER_COUNT/1/g /incubator-openwhisk-deploy-kube/ansible-kube/environments/kube/group_vars/all
+ cp -R /openwhisk/ansible/ /incubator-openwhisk-deploy-kube/ansible
+ cp -R /openwhisk/tools/ /incubator-openwhisk-deploy-kube/tools
+ cp -R /openwhisk/bin/ /incubator-openwhisk-deploy-kube/bin
+ mkdir -p /incubator-openwhisk-deploy-kube/core
+ cp -R /openwhisk/core/routemgmt /incubator-openwhisk-deploy-kube/core/routemgmt
+ cp -R /incubator-openwhisk-deploy-kube/ansible-kube/. /incubator-openwhisk-deploy-kube/ansible/
+ pushd /incubator-openwhisk-deploy-kube/ansible
+ ansible-playbook -i environments/kube setup.yml
/incubator-openwhisk-deploy-kube/ansible /
+ kubectl proxy -p 8001
Starting to serve on 127.0.0.1:8001
PLAY [localhost] ***************************************************************
:
Wednesday 12 July 2017  02:12:14 +0000 (0:00:00.985)       0:00:10.246 ******** 
=============================================================================== 
Gathering Facts --------------------------------------------------------- 8.15s
gen untrusted certificate for host -------------------------------------- 0.99s
prepare db_local.ini ---------------------------------------------------- 0.56s
check if db_local.ini exists? ------------------------------------------- 0.35s
find the ip of docker-machine ------------------------------------------- 0.04s
get the docker-machine ip ----------------------------------------------- 0.04s
gen hosts for docker-machine -------------------------------------------- 0.04s
add new db host on-the-fly ---------------------------------------------- 0.04s
+ kubectl apply -f environments/kube/files/db-service.yml
service "couchdb" created
+ kubectl apply -f environments/kube/files/consul-service.yml
service "consul" created
+ kubectl apply -f environments/kube/files/zookeeper-service.yml
service "zookeeper" created
+ kubectl apply -f environments/kube/files/kafka-service.yml
service "kafka" created
+ kubectl apply -f environments/kube/files/controller-service.yml
service "controller" created
+ kubectl apply -f environments/kube/files/invoker-service.yml
service "invoker" created
+ deployCouchDB
++ kubectl -n openwhisk get pods --show-all
++ grep 1/1
++ grep couchdb
+ COUCH_DEPLOYED=
+ '[' -z '' ']'
+ return 0
+ ansible-playbook -i environments/kube couchdb.yml

PLAY [db] **********************************************************************
:

[Result]

$kubectl -n openwhisk get pods --show-all=true
NAME                          READY     STATUS      RESTARTS   AGE
configure-openwhisk-3w4gv     0/1       Completed   0          6h
consul-57995027-8gsvd         2/2       Running     0          6h
controller-3250411552-j3fvk   1/1       Running     0          6h
couchdb-109298327-kbr26       1/1       Running     0          6h
invoker-0                     1/1       Running     0          6h
kafka-1060962555-f71g9        1/1       Running     0          6h
zookeeper-1304892743-03ksc    1/1       Running     0          6h

At this time, do we have to create our own customized image to build an OpenWhisk environment on Kubernetes?

Regards,

Issues deploying with custom docker images

Summary

Deploying with custom docker images fails on routemgmt : install route management actions.

Steps to reproduce

Modify configuration to use new images:

  • ansible-kube/environments/kube/files/nginx.yml
    34       containers:
    35       - name: nginx
    36         imagePullPolicy: Always
    37         image: [account_name]/whisk_nginx
    38         ports:
  • configure/configure_whisk.yml
    17       containers:
    18       - name: configure-openwhisk
    19         image: [account_name]/whisk_config:v1.5.6-dev
    20         imagePullPolicy: Always

Log snippet

Verbose logging enabled for ansible

TASK [routemgmt : install route management actions] ****************************
task path: /incubator-openwhisk-deploy-kube/ansible/roles/routemgmt/tasks/deploy.yml:3
Monday 19 June 2017  21:16:38 +0000 (0:00:08.756)       0:05:33.797 *********** 
fatal: [ansible]: FAILED! => {
    "failed": true, 
    "msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'cli_path' is undefined\n\nThe error appears to have been in '/incubator-openwhisk-deploy-kube/ansible/roles/routemgmt/tasks/deploy.yml': line 3, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# Install the API Gateway route management actions.\n- name: install route management actions\n  ^ here\n"
}

Nginx should not need Ansible for configuration

Ideally we need to be able to remove the Ansible configuration that is used to set up Consul.

  • Build Nginx with all OpenWhisk specific requirements (wsk, blackbox) pre-built into the Docker image.
  • Script that helps generate certificates
  • create a Kube ConfigMap and/or Secrets Resource from those certs and a static Nginx.conf file. Where this nginx.conf file is specific to an environment
  • Have yaml file(s) for the Kube Deployment and Service which uses the generated ConfigMap
  • Nginx is configured via a yaml file with documentation
  • Nginx Docker image is built via the OpenWhisk CI and is not a custom image

OpenWhisk should be deployed all through yaml

A number of components will need to be configured so they are more "Dockerized" to remove the need for Ansible. General implementations should be able to use:

  • Have Kube YAML files as the primary source of configuration
  • Utilize ENVIRONMENT vars
  • "bake" other configs into Docker images
  • Determine if we need to do better health checking to determine if we should start. Do not fail to run a process

Components to be removed:

The original proposal for this work can be found here

issue with invoker

When trying to deploy, I run into this issue with the invoker.

Failed to start container with id 8d9125bf2d3711312a98a8b98de15306e495883cc470a03beb6689b34895791f with error: rpc error: code = 2 desc = failed to start container "8d9125bf2d3711312a98a8b98de15306e495883cc470a03beb6689b34895791f": Error response from daemon: {"message":"mkdir /usr/lib/x86_64-linux-gnu: read-only file system"}
Error syncing pod, skipping: failed to "StartContainer" for "invoker" with rpc error: code = 2 desc = failed to start container "8d9125bf2d3711312a98a8b98de15306e495883cc470a03beb6689b34895791f": Error response from daemon: {"message":"mkdir /usr/lib/x86_64-linux-gnu: read-only file system"}: "Start Container Failed"

on the logs of the pod in k8s I see the following

failed to open log file "/var/log/pods/7fce67a4-4d69-11e7-a736-0242ac110003/invoker_41.log": open /var/log/pods/7fce67a4-4d69-11e7-a736-0242ac110003/invoker_41.log: no such file or directory

I checked if the host drive was filling up, but everything is all good there.

I run k8s 1.6.2 on CoreOs 1353.7.
I edited the configure_whisk.yml to use the 1.6.2 version.

      - name: configure-openwhisk
        image: danlavine/whisk_config:v1.6.2
        imagePullPolicy: Always

I tried the deployment 3 times with the same result.

Install Route Management from the script

Ideally we need to be able to remove the Ansible configuration that is used to set up RouteManagement. To do this, we should be able to run the script by hand when it is pointed at an OpenWhisk deployment.

Controller should not need Ansible for configuration

Ideally we need to be able to remove the Ansible configuration that is used to set up the Controller.

  • Provide the ability for the controller to receive updates that new invoker instances are able to be used.
    • This already happens by default. Currently Kafka can receive new topics to automatically be created and used.
  • Need to make sure we use stateful sets so controller has unique names.
  • Controller should have better retry logic when trying to connect to CouchDB. The process should not crash if it cannot connect

Nginx cannot forward subsequent requests to the controller once the primary controller is restarted.

When deployed with multiple controllers in the cluster, the nginx node is configured in load-balancing mode; e.g., if we deploy two controllers, there will be controller-0 and controller-1 in the cluster, where controller-0 is the primary controller node and controller-1 is the standby controller.

In this case, if we kill controller-0, K8s will automatically restart this pod, but after this controller instance is restarted, nginx stops forwarding client requests to the controller pod, so the client cannot reach the whisk cluster.

/cc @dgrove-oss

issue deploy openwhisk on kubernetes

When I run "ansible-playbook -i environments/kube setup.yml", the error is as below:

fatal: [localhost]: FAILED! => {"failed": true, "msg": "The conditional check 'nginx.ssl.cert == "openwhisk-cert.pem"' failed. The error was: error while evaluating conditional (nginx.ssl.cert == "openwhisk-cert.pem"): {u'confdir': u'{{ config_root_dir }}/nginx', u'ssl': {u'path': u'{{ openwhisk_home }}/ansible/roles/nginx/files', u'cert': u'openwhisk-cert.pem', u'password_enabled': False, u'key': u'openwhisk-key.pem', u'password_file': u'ssl.pass'}, u'version': 1.11, u'port': {u'adminportal': 8443, u'api': 443, u'http': 80}}: 'config_root_dir' is undefined\n\nThe error appears to have been in '/home/h3c/zhy/git/serverless/wsk/openwhisk_k8s/incubator-openwhisk-deploy-kube/ansible/setup.yml': line 33, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: gen untrusted certificate for host\n ^ here\n"}

Adjust build script to assume git clone of "incubator-openwhisk"

Looking at the build scripts setting up the OPENWHISK_DIR env. var. from issue:
#6

It seemed wrong to default to a directory for openwhisk that did not reflect a "git clone" which would place the code in an /incubator-openwhisk/... path.

In addition, we need to be aware that the OW CLI will be moving out of incubator-openwhisk this month into its own repo (with plug-in Go client) here:
https://github.com/apache/incubator-openwhisk-cli

Perhaps we need a means to establish a consistent "environment" based upon openstack repo. paths that all scripts could use?

Failed to connect to couchdb.openwhisk port 5984: Connection timed out

Here is the log tail from trying to bring up CouchDB on K8S.

+ ansible-playbook -i environments/local setup.yml -e db_host=couchdb.openwhisk -e db_prefix=test_ -e db_username=whisk_admin -e db_password=some_passw0rd -e db_port=5984 -e openwhisk_home=/openwhisk
CouchDB is up and running
/openwhisk/ansible /openwhisk /var/lib/couchdb

PLAY [localhost] ***************************************************************

TASK [Gathering Facts] *********************************************************
Monday 11 September 2017 18:31:50 +0000 (0:00:00.024) 0:00:00.024 ******
ok: [localhost]

TASK [find the ip of docker-machine] *******************************************
Monday 11 September 2017 18:31:50 +0000 (0:00:00.548) 0:00:00.572 ******
skipping: [localhost]

TASK [get the docker-machine ip] ***********************************************
Monday 11 September 2017 18:31:50 +0000 (0:00:00.027) 0:00:00.600 ******
skipping: [localhost]

TASK [gen hosts for docker-machine] ********************************************
Monday 11 September 2017 18:31:50 +0000 (0:00:00.032) 0:00:00.633 ******
skipping: [localhost]

TASK [add new db host on-the-fly] **********************************************
Monday 11 September 2017 18:31:50 +0000 (0:00:00.029) 0:00:00.662 ******
skipping: [localhost]

TASK [check if db_local.ini exists?] *******************************************
Monday 11 September 2017 18:31:50 +0000 (0:00:00.025) 0:00:00.688 ******
ok: [localhost]

TASK [prepare db_local.ini] ****************************************************
Monday 11 September 2017 18:31:51 +0000 (0:00:00.267) 0:00:00.955 ******
changed: [localhost -> localhost]

TASK [gen untrusted server certificate for host] *******************************
Monday 11 September 2017 18:31:51 +0000 (0:00:00.437) 0:00:01.392 ******
changed: [localhost -> localhost]

TASK [gen untrusted client certificate for host] *******************************
Monday 11 September 2017 18:31:51 +0000 (0:00:00.320) 0:00:01.713 ******
changed: [localhost -> localhost]

PLAY RECAP *********************************************************************
localhost : ok=5 changed=3 unreachable=0 failed=0

Monday 11 September 2017 18:31:52 +0000 (0:00:00.394) 0:00:02.108 ******

Gathering Facts --------------------------------------------------------- 0.55s
prepare db_local.ini ---------------------------------------------------- 0.44s
gen untrusted client certificate for host ------------------------------- 0.39s
gen untrusted server certificate for host ------------------------------- 0.32s
check if db_local.ini exists? ------------------------------------------- 0.27s
get the docker-machine ip ----------------------------------------------- 0.03s
gen hosts for docker-machine -------------------------------------------- 0.03s
find the ip of docker-machine ------------------------------------------- 0.03s
add new db host on-the-fly ---------------------------------------------- 0.03s
/openwhisk /var/lib/couchdb


controller and invoker don't start, log shows "Bad configuration, cannot start."

I followed the instructions to deploy openwhisk on minikube, but the Controller and Invoker didn't start.
The log of the Controller:

[2017-11-15T05:33:36.542Z] [INFO] Initializing Kamon...
[INFO] [11/15/2017 05:33:36.807] [main] [StatsDExtension(akka://kamon)] Starting the Kamon(StatsD) extension
[2017-11-15T05:33:36.868Z] [INFO] Slf4jLogger started
[2017-11-15T05:33:37.154Z] [INFO] [??] [Config] environment set value for db.whisk.actions
[2017-11-15T05:33:37.155Z] [INFO] [??] [Config] environment set value for limits.actions.sequence.maxLength
[2017-11-15T05:33:37.156Z] [INFO] [??] [Config] environment set value for limits.triggers.fires.perMinute
[2017-11-15T05:33:37.156Z] [INFO] [??] [Config] environment set value for db.protocol
[2017-11-15T05:33:37.156Z] [INFO] [??] [Config] environment set value for akka.cluster.seed.nodes
[2017-11-15T05:33:37.157Z] [INFO] [??] [Config] environment set value for loadbalancer.invokerBusyThreshold
[2017-11-15T05:33:37.157Z] [INFO] [??] [Config] environment set value for controller.instances
[2017-11-15T05:33:37.158Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.concurrent
[2017-11-15T05:33:37.158Z] [INFO] [??] [Config] environment set value for whisk.version.date
[2017-11-15T05:33:37.159Z] [INFO] [??] [Config] environment set value for controller.localBookkeeping
[2017-11-15T05:33:37.159Z] [INFO] [??] [Config] environment set value for db.port
[2017-11-15T05:33:37.159Z] [INFO] [??] [Config] environment set value for whisk.version.buildno
[2017-11-15T05:33:37.160Z] [INFO] [??] [Config] environment set value for db.username
[2017-11-15T05:33:37.160Z] [INFO] [??] [Config] environment set value for db.whisk.activations
[2017-11-15T05:33:37.161Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.perMinute
[2017-11-15T05:33:37.161Z] [INFO] [??] [Config] environment set value for db.whisk.auths
[2017-11-15T05:33:37.161Z] [INFO] [??] [Config] environment set value for runtimes.manifest
[2017-11-15T05:33:37.162Z] [INFO] [??] [Config] environment set value for kafka.host.port
[2017-11-15T05:33:37.166Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.concurrentInSystem
[2017-11-15T05:33:37.166Z] [INFO] [??] [Config] environment set value for controller.blackboxFraction
[2017-11-15T05:33:37.166Z] [INFO] [??] [Config] environment set value for db.host
[2017-11-15T05:33:37.167Z] [INFO] [??] [Config] environment set value for port
[2017-11-15T05:33:37.167Z] [INFO] [??] [Config] environment set value for db.provider
[2017-11-15T05:33:37.167Z] [INFO] [??] [Config] environment set value for db.password
[2017-11-15T05:33:37.167Z] [INFO] [??] [Config] environment set value for kafka.host
[2017-11-15T05:33:37.173Z] [ERROR] [??] [Config] required property db.whisk.actions.ddoc still not set
[2017-11-15T05:33:37.174Z] [ERROR] [??] [Controller] Bad configuration, cannot start.
[INFO] [11/15/2017 05:33:37.196] [Thread-0] [CoordinatedShutdown(akka://kamon)] Starting coordinated shutdown from JVM shutdown hook

CI occasionally flakes out

Every once in a while CI will fail with:

error: error validating "configure/openwhisk_kube_namespace.yml": error validating data: open /home/travis/.kube/schema/v1.5.6/api/v1/schema.json: permission denied; if you choose to ignore these errors, turn validation off with --validate=false

For whatever reason we cannot open the file /home/travis/.kube/schema/v1.5.6/api/v1/schema.json because we do not have permissions. Initially this file is created via a sudo command when setting up Kubernetes, but I thought that I had put in a fix for this here

"wsk api list" is not working

I already created a Java action, and it's OK to call it from the command line.
Now I'm trying the API Gateway, but I got the error message below:

./wsk api list
error: Unable to obtain the API list: The requested resource does not exist. (code 22528)

I checked the log of the controller:

[2017-12-01T09:06:06.755Z] [INFO] [#tid_22154] [CouchDbRestStore] [GET] 'test_whisks', document: 'id: whisk.system/apimgmt/getApi'; not found. [marker:database_getDocument_finish:34:3]
[2017-12-01T09:06:06.756Z] [INFO] [#tid_22154] [WhiskActionMetaData] invalidating CacheKey(whisk.system/apimgmt/getApi)
[2017-12-01T09:06:06.756Z] [INFO] [#tid_22154] [BasicHttpService] [marker:http_get.404_count:35:35]

I guess something is missing in CouchDB, but I don't know how to fix it.

Unable to connect to zookeeper.openwhisk:2181

Hi,

It seems zookeeper is deployed but kafka is failing to connect.

$ kubectl get pods,services --all-namespaces=true -o wide
NAMESPACE   NAME                                   READY     STATUS             RESTARTS   AGE       IP           NODE
default     po/kube-apiserver-127.0.0.1            1/1       Running            0          9h        127.0.0.1    127.0.0.1
default     po/kube-controller-manager-127.0.0.1   1/1       Running            0          9h        127.0.0.1    127.0.0.1
default     po/kube-scheduler-127.0.0.1            1/1       Running            0          9h        127.0.0.1    127.0.0.1
openwhisk   po/apigateway-3311085443-mbc4c         1/1       Running            0          3h        172.17.0.4   127.0.0.1
openwhisk   po/controller-0                        0/1       CrashLoopBackOff   79         3h        172.17.0.7   127.0.0.1
openwhisk   po/controller-1                        0/1       CrashLoopBackOff   79         3h        172.17.0.8   127.0.0.1
openwhisk   po/couchdb-710020075-gfqdf             1/1       Running            0          8h        172.17.0.2   127.0.0.1
openwhisk   po/kafka-1473765933-v4p3c              0/1       CrashLoopBackOff   9          25m       172.17.0.6   127.0.0.1
openwhisk   po/redis-1106165648-zgxcs              1/1       Running            0          3h        172.17.0.3   127.0.0.1
openwhisk   po/zookeeper-1304892743-0zw8h          1/1       Running            0          29m       172.17.0.5   127.0.0.1

NAMESPACE   NAME             CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE       SELECTOR
default     svc/kubernetes   10.254.0.1      <none>        443/TCP                      9h        <none>
openwhisk   svc/apigateway   10.254.90.227   <none>        8080/TCP,9000/TCP            3h        name=apigateway
openwhisk   svc/controller   None            <none>        8080/TCP                     3h        name=controller
openwhisk   svc/couchdb      10.254.249.61   <none>        5984/TCP                     8h        name=couchdb
openwhisk   svc/kafka        10.254.67.71    <none>        9092/TCP                     28m       name=kafka
openwhisk   svc/redis        10.254.10.167   <none>        6379/TCP                     3h        name=redis
openwhisk   svc/zookeeper    10.254.144.28   <none>        2181/TCP,2888/TCP,3888/TCP   3h        name=zookeeper

The log of kafka:

waiting for kafka to be available
+ '[' -n 10.254.144.28 ']'
+ ZOOKEEPER_IP=10.254.144.28
+ '[' -n 2181 ']'
+ ZOOKEEPER_PORT=2181
++ grep '\skafka-1473765933-v4p3c$' /etc/hosts
++ head -n 1
++ awk '{print $1}'
+ IP=172.17.0.6
+ '[' -z '' ']'
+ ZOOKEEPER_CONNECTION_STRING=10.254.144.28:2181
+ cat /kafka/config/server.properties.template
+ sed -e 's|{{KAFKA_ADVERTISED_HOST_NAME}}|kafka.openwhisk|g' -e 's|{{KAFKA_ADVERTISED_PORT}}|9092|g' -e 's|{{KAFKA_AUTO_CREATE_TOPICS_ENABLE}}|true|g' -e 's|{{KAFKA_BROKER_ID}}|0|g' -e 's|{{KAFKA_DEFAULT_REPLICATION_FACTOR}}|1|g' -e 's|{{KAFKA_DELETE_TOPIC_ENABLE}}|false|g' -e 's|{{KAFKA_GROUP_MAX_SESSION_TIMEOUT_MS}}|300000|g' -e 's|{{KAFKA_INTER_BROKER_PROTOCOL_VERSION}}|0.10.2.1|g' -e 's|{{KAFKA_LOG_MESSAGE_FORMAT_VERSION}}|0.10.2.1|g' -e 's|{{KAFKA_LOG_RETENTION_HOURS}}|168|g' -e 's|{{KAFKA_NUM_PARTITIONS}}|1|g' -e 's|{{KAFKA_PORT}}|9092|g' -e 's|{{ZOOKEEPER_CHROOT}}||g' -e 's|{{ZOOKEEPER_CONNECTION_STRING}}|10.254.144.28:2181|g' -e 's|{{ZOOKEEPER_CONNECTION_TIMEOUT_MS}}|10000|g' -e 's|{{ZOOKEEPER_SESSION_TIMEOUT_MS}}|10000|g'
+ '[' -z ']'
+ KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote=true
+ KAFKA_JMX_OPTS='-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false'
+ KAFKA_JMX_OPTS='-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false'
+ KAFKA_JMX_OPTS='-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.rmi.port=7203'
+ KAFKA_JMX_OPTS='-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.rmi.port=7203 -Djava.rmi.server.hostname=kafka.openwhisk '
+ export KAFKA_JMX_OPTS
+ echo 'Starting kafka'
+ exec /kafka/bin/kafka-server-start.sh /kafka/config/server.properties
Starting kafka
waiting for kafka to be available
waiting for kafka to be available
waiting for kafka to be available
waiting for kafka to be available
[2017-11-29 03:25:42,564] INFO KafkaConfig values:
        advertised.host.name = kafka.openwhisk
        advertised.listeners = null
        advertised.port = 9092
        authorizer.class.name =
        auto.create.topics.enable = true
        auto.leader.rebalance.enable = true
        background.threads = 10
        broker.id = 0
        broker.id.generation.enable = true
        broker.rack = null
        compression.type = producer
        connections.max.idle.ms = 600000
        controlled.shutdown.enable = true
        controlled.shutdown.max.retries = 3
        controlled.shutdown.retry.backoff.ms = 5000
        controller.socket.timeout.ms = 30000
        create.topic.policy.class.name = null
        default.replication.factor = 1
        delete.topic.enable = false
        fetch.purgatory.purge.interval.requests = 1000
        group.max.session.timeout.ms = 300000
        group.min.session.timeout.ms = 6000
        host.name =
        inter.broker.listener.name = null
        inter.broker.protocol.version = 0.10.2.1
        leader.imbalance.check.interval.seconds = 300
        leader.imbalance.per.broker.percentage = 10
        listener.security.protocol.map = SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,TRACE:TRACE,SASL_SSL:SASL_SSL,PLAINTEXT:PLAINTEXT
        listeners = null
        log.cleaner.backoff.ms = 15000
        log.cleaner.dedupe.buffer.size = 134217728
        log.cleaner.delete.retention.ms = 86400000
        log.cleaner.enable = true
        log.cleaner.io.buffer.load.factor = 0.9
        log.cleaner.io.buffer.size = 524288
        log.cleaner.io.max.bytes.per.second = 1.7976931348623157E308
        log.cleaner.min.cleanable.ratio = 0.5
        log.cleaner.min.compaction.lag.ms = 0
        log.cleaner.threads = 1
        log.cleanup.policy = [delete]
        log.dir = /data
        log.dirs = /data
        log.flush.interval.messages = 9223372036854775807
        log.flush.interval.ms = null
        log.flush.offset.checkpoint.interval.ms = 60000
        log.flush.scheduler.interval.ms = 9223372036854775807
        log.index.interval.bytes = 4096
        log.index.size.max.bytes = 10485760
        log.message.format.version = 0.10.2.1
        log.message.timestamp.difference.max.ms = 9223372036854775807
        log.message.timestamp.type = CreateTime
        log.preallocate = false
        log.retention.bytes = -1
        log.retention.check.interval.ms = 300000
        log.retention.hours = 168
        log.retention.minutes = null
        log.retention.ms = null
        log.roll.hours = 168
        log.roll.jitter.hours = 0
        log.roll.jitter.ms = null
        log.roll.ms = null
        log.segment.bytes = 1073741824
        log.segment.delete.delay.ms = 60000
        max.connections.per.ip = 2147483647
        max.connections.per.ip.overrides =
        message.max.bytes = 1000012
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        min.insync.replicas = 1
        num.io.threads = 8
        num.network.threads = 3
        num.partitions = 1
        num.recovery.threads.per.data.dir = 1
        num.replica.fetchers = 1
        offset.metadata.max.bytes = 4096
        offsets.commit.required.acks = -1
        offsets.commit.timeout.ms = 5000
        offsets.load.buffer.size = 5242880
        offsets.retention.check.interval.ms = 600000
        offsets.retention.minutes = 1440
        offsets.topic.compression.codec = 0
        offsets.topic.num.partitions = 50
        offsets.topic.replication.factor = 3
        offsets.topic.segment.bytes = 104857600
        port = 9092
        principal.builder.class = class org.apache.kafka.common.security.auth.DefaultPrincipalBuilder
        producer.purgatory.purge.interval.requests = 1000
        queued.max.requests = 500
        quota.consumer.default = 9223372036854775807
        quota.producer.default = 9223372036854775807
        quota.window.num = 11
        quota.window.size.seconds = 1
        replica.fetch.backoff.ms = 1000
        replica.fetch.max.bytes = 1048576
        replica.fetch.min.bytes = 1
        replica.fetch.response.max.bytes = 10485760
        replica.fetch.wait.max.ms = 500
        replica.high.watermark.checkpoint.interval.ms = 5000
        replica.lag.time.max.ms = 10000
        replica.socket.receive.buffer.bytes = 65536
        replica.socket.timeout.ms = 30000
        replication.quota.window.num = 11
        replication.quota.window.size.seconds = 1
        request.timeout.ms = 30000
        reserved.broker.max.id = 1000
        sasl.enabled.mechanisms = [GSSAPI]
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.principal.to.local.rules = [DEFAULT]
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.mechanism.inter.broker.protocol = GSSAPI
        security.inter.broker.protocol = PLAINTEXT
        socket.receive.buffer.bytes = 102400
        socket.request.max.bytes = 104857600
        socket.send.buffer.bytes = 102400
        ssl.cipher.suites = null
        ssl.client.auth = none
        ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
        ssl.endpoint.identification.algorithm = null
        ssl.key.password = null
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.location = null
        ssl.keystore.password = null
        ssl.keystore.type = JKS
        ssl.protocol = TLS
        ssl.provider = null
        ssl.secure.random.implementation = null
        ssl.trustmanager.algorithm = PKIX
        ssl.truststore.location = null
        ssl.truststore.password = null
        ssl.truststore.type = JKS
        unclean.leader.election.enable = true
        zookeeper.connect = 10.254.144.28:2181
        zookeeper.connection.timeout.ms = 10000
        zookeeper.session.timeout.ms = 10000
        zookeeper.set.acl = false
        zookeeper.sync.time.ms = 2000
 (kafka.server.KafkaConfig)
[2017-11-29 03:25:42,621] INFO starting (kafka.server.KafkaServer)
[2017-11-29 03:25:42,623] INFO Connecting to zookeeper on 10.254.144.28:2181 (kafka.server.KafkaServer)
[2017-11-29 03:25:42,635] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2017-11-29 03:25:42,641] INFO Client environment:zookeeper.version=3.4.9-1757313, built on 08/23/2016 06:50 GMT (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,641] INFO Client environment:host.name=kafka-1473765933-v4p3c (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,641] INFO Client environment:java.version=1.8.0_60 (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,641] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,641] INFO Client environment:java.home=/usr/lib/jvm/java-8-oracle/jre (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,641] INFO Client environment:java.class.path=:/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b05.jar:/kafka/bin/../libs/argparse4j-0.7.0.jar:/kafka/bin/../libs/connect-api-0.10.2.1.jar:/kafka/bin/../libs/connect-file-0.10.2.1.jar:/kafka/bin/../libs/connect-json-0.10.2.1.jar:/kafka/bin/../libs/connect-runtime-0.10.2.1.jar:/kafka/bin/../libs/connect-transforms-0.10.2.1.jar:/kafka/bin/../libs/guava-18.0.jar:/kafka/bin/../libs/hk2-api-2.5.0-b05.jar:/kafka/bin/../libs/hk2-locator-2.5.0-b05.jar:/kafka/bin/../libs/hk2-utils-2.5.0-b05.jar:/kafka/bin/../libs/jackson-annotations-2.8.0.jar:/kafka/bin/../libs/jackson-annotations-2.8.5.jar:/kafka/bin/../libs/jackson-core-2.8.5.jar:/kafka/bin/../libs/jackson-databind-2.8.5.jar:/kafka/bin/../libs/jackson-jaxrs-base-2.8.5.jar:/kafka/bin/../libs/jackson-jaxrs-json-provider-2.8.5.jar:/kafka/bin/../libs/jackson-module-jaxb-annotations-2.8.5.jar:/kafka/bin/../libs/javassist-3.20.0-GA.jar:/kafka/bin/../libs/javax.annotation-api-1.2.jar:/kafka/bin/../libs/javax.inject-1.jar:/kafka/bin/../libs/javax.inject-2.5.0-b05.jar:/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/kafka/bin/../libs/jersey-client-2.24.jar:/kafka/bin/../libs/jersey-common-2.24.jar:/kafka/bin/../libs/jersey-container-servlet-2.24.jar:/kafka/bin/../libs/jersey-container-servlet-core-2.24.jar:/kafka/bin/../libs/jersey-guava-2.24.jar:/kafka/bin/../libs/jersey-media-jaxb-2.24.jar:/kafka/bin/../libs/jersey-server-2.24.jar:/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/kafka/bin/../libs/jopt-simple-5.0.3.jar:/kafka/bin/../libs/kafka-clients-0.10.2.1.jar:/kafka/bin/../libs/kafka-log4j-appender-0.10.2.1.jar:/kafka/bin/../libs/kafka-streams-0.10.2.1.jar:/kafka/bin/../libs/kafka-streams-examples-0.10.2.1.jar:/kafka/bin/../libs/kafka-tools-0.10.2.1.jar:/kafka/bin/../libs/kafka_2.12-0.10.2.1-sources.jar:/kafka/bin/../libs/kafka_2.12-0.10.2.1-test-sources.jar:/kafka/bin/../libs/kafka_2.12-0.10.2.1.jar:/kafka/bin/../libs/log4j-1.2.17.jar:/kafka/bin/../libs/lz4-1.3.0.jar:/kafka/bin/../libs/metrics-core-2.2.0.jar:/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/kafka/bin/../libs/reflections-0.9.10.jar:/kafka/bin/../libs/rocksdbjni-5.0.1.jar:/kafka/bin/../libs/scala-library-2.12.1.jar:/kafka/bin/../libs/scala-parser-combinators_2.12-1.0.4.jar:/kafka/bin/../libs/slf4j-api-1.7.21.jar:/kafka/bin/../libs/slf4j-log4j12-1.7.21.jar:/kafka/bin/../libs/snappy-java-1.1.2.6.jar:/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/kafka/bin/../libs/zkclient-0.10.jar:/kafka/bin/../libs/zookeeper-3.4.9.jar (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,642] INFO Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,642] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
waiting for kafka to be available
[2017-11-29 03:25:42,648] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,648] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,648] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,648] INFO Client environment:os.version=3.10.0-693.2.2.el7.x86_64 (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,648] INFO Client environment:user.name=kafka (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,648] INFO Client environment:user.home=/kafka (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,648] INFO Client environment:user.dir=/kafka (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,649] INFO Initiating client connection, connectString=10.254.144.28:2181 sessionTimeout=10000 watcher=org.I0Itec.zkclient.ZkClient@1fe20588 (org.apache.zookeeper.ZooKeeper)
[2017-11-29 03:25:42,662] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2017-11-29 03:25:42,664] INFO Opening socket connection to server 10.254.144.28/10.254.144.28:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2017-11-29 03:25:42,673] INFO Socket connection established to 10.254.144.28/10.254.144.28:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2017-11-29 03:25:42,686] INFO Session establishment complete on server 10.254.144.28/10.254.144.28:2181, sessionid = 0x10121a8f0f0000c, negotiated timeout = 10000 (org.apache.zookeeper.ClientCnxn)
[2017-11-29 03:25:42,687] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2017-11-29 03:25:42,804] INFO Cluster ID = v198P3b6SfiQqA-bxeU0KA (kafka.server.KafkaServer)
[2017-11-29 03:25:42,805] WARN No meta.properties file under dir /data/meta.properties (kafka.server.BrokerMetadataCheckpoint)
waiting for kafka to be available
[2017-11-29 03:25:42,884] INFO [ThrottledRequestReaper-Fetch], Starting  (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2017-11-29 03:25:42,885] INFO [ThrottledRequestReaper-Produce], Starting  (kafka.server.ClientQuotaManager$ThrottledRequestReaper)
[2017-11-29 03:25:42,919] INFO Loading logs. (kafka.log.LogManager)
[2017-11-29 03:25:42,925] INFO Logs loading complete in 6 ms. (kafka.log.LogManager)
[2017-11-29 03:25:43,008] INFO Starting log cleanup with a period of 300000 ms. (kafka.log.LogManager)
[2017-11-29 03:25:43,012] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager)
waiting for kafka to be available
[2017-11-29 03:25:43,049] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.Acceptor)
kafka is up and running
Create health topic
+ echo 'Create health topic'
++ kafka-topics.sh --create --topic health --replication-factor 1 --partitions 1 --zookeeper zookeeper.openwhisk:2181 --config retention.bytes=536870912 --config retention.ms=1073741824 --config segment.bytes=3600000
[2017-11-29 03:25:43,051] INFO [Socket Server on Broker 0], Started 1 acceptor threads (kafka.network.SocketServer)
[2017-11-29 03:25:43,117] INFO [ExpirationReaper-0], Starting  (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2017-11-29 03:25:43,119] INFO [ExpirationReaper-0], Starting  (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2017-11-29 03:25:43,156] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2017-11-29 03:25:43,170] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2017-11-29 03:25:43,170] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2017-11-29 03:25:43,360] INFO [ExpirationReaper-0], Starting  (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2017-11-29 03:25:43,363] INFO [ExpirationReaper-0], Starting  (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2017-11-29 03:25:43,364] INFO [ExpirationReaper-0], Starting  (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2017-11-29 03:25:43,366] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2017-11-29 03:25:43,375] INFO [GroupCoordinator 0]: Starting up. (kafka.coordinator.GroupCoordinator)
[2017-11-29 03:25:43,376] INFO [GroupCoordinator 0]: Startup complete. (kafka.coordinator.GroupCoordinator)
[2017-11-29 03:25:43,378] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 2 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-11-29 03:25:43,436] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
[2017-11-29 03:25:43,455] INFO Creating /brokers/ids/0 (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2017-11-29 03:25:43,474] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2017-11-29 03:25:43,475] INFO Registered broker 0 at path /brokers/ids/0 with addresses: EndPoint(kafka.openwhisk,9092,ListenerName(PLAINTEXT),PLAINTEXT) (kafka.utils.ZkUtils)
[2017-11-29 03:25:43,476] WARN No meta.properties file under dir /data/meta.properties (kafka.server.BrokerMetadataCheckpoint)
[2017-11-29 03:25:43,506] INFO Kafka version : 0.10.2.1 (org.apache.kafka.common.utils.AppInfoParser)
[2017-11-29 03:25:43,506] INFO Kafka commitId : e89bffd6b2eff799 (org.apache.kafka.common.utils.AppInfoParser)
[2017-11-29 03:25:43,506] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
Exception in thread "main" org.I0Itec.zkclient.exception.ZkException: Unable to connect to zookeeper.openwhisk:2181
        at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:72)
        at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1228)
        at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:157)
        at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:131)
        at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:106)
        at kafka.utils.ZkUtils$.apply(ZkUtils.scala:88)
        at kafka.admin.TopicCommand$.main(TopicCommand.scala:56)
        at kafka.admin.TopicCommand.main(TopicCommand.scala)
Caused by: java.net.UnknownHostException: zookeeper.openwhisk: unknown error
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
        at java.net.InetAddress.getAllByName(InetAddress.java:1192)
        at java.net.InetAddress.getAllByName(InetAddress.java:1126)
        at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
        at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:70)
        ... 7 more
+ OUTPUT=
+ [[ '' == *\a\l\r\e\a\d\y\ \e\x\i\s\t\s* ]]
+ [[ '' == *\C\r\e\a\t\e\d\ \t\o\p\i\c* ]]
Failed to create heath topic
+ echo 'Failed to create heath topic'
+ exit 1

I was following the documentation and did not make changes to the deployment yaml files.

Thank you,
Hyungro

Simplify deployment by removing the Ansible job

Deployment on Kubernetes today creates a Kubernetes Job that runs Ansible playbooks, which in turn create the other Kubernetes resources. It would be great to simplify this so that a user can simply run kubectl create -f deploy_openwhisk.yml to deploy an entire working OpenWhisk system directly, instead of deploying a Job that then deploys the other pieces.

This will require pulling all configuration out of Ansible into ConfigMaps or something similar that Kubernetes can use. Additionally, the CouchDB and Kafka image start scripts (or the images themselves) will need to be changed to create the necessary Kafka topics and apply the CouchDB configuration changes, since Ansible will no longer be running to do this after the images boot.
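
As a rough illustration (not part of the issue itself), the kind of ConfigMap this might lead to; the name and keys below are hypothetical placeholders for values that today live in the Ansible playbooks:

apiVersion: v1
kind: ConfigMap
metadata:
  name: whisk.config            # hypothetical name
  namespace: openwhisk
data:
  db_protocol: "http"           # placeholder keys for values currently templated by Ansible
  db_host: "couchdb.openwhisk"
  db_port: "5984"
  kafka_host: "kafka.openwhisk"
  kafka_port: "9092"

Pods could then consume these values via env valueFrom/configMapKeyRef or envFrom rather than having Ansible template them in after boot.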

couldn't create topic in Kafka

error log from controller-0
[2017-11-29T03:44:02.766Z] [ERROR] [??] [KafkaMessagingProvider] exception during creation of topic completed0
[2017-11-29T03:44:02.768Z] [ERROR] [??] [Controller] failure during msgProvider.ensureTopic for topic completed0

error log from invoker-0
[2017-11-29T03:50:04.156Z] [ERROR] [??] [KafkaMessagingProvider] exception during creation of topic invoker0
[2017-11-29T03:50:04.234Z] [ERROR] [#sid_100] [Invoker] failure during msgProvider.ensureTopic for topic invoker0

log from Kafka:
[2017-11-29 03:21:12,941] INFO Kafka version : 0.10.0.1 (org.apache.kafka.common.utils.AppInfoParser)
[2017-11-29 03:21:12,941] INFO Kafka commitId : a7a17cdec9eaa6c5 (org.apache.kafka.common.utils.AppInfoParser)
[2017-11-29 03:21:12,942] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
Create completed topics
Create invoker topics
/start.sh

How to create a new namespace

The wsk and wskadmin CLIs don't seem to have options to create a namespace. What should I do to create a new namespace?

Remove Consul from OpenWhisk

If all OpenWhisk components are fully configured with appropriate configuration values, we should be able to remove Consul from the OpenWhisk deployment.

  • Nuke Consul

only need java runtime

Is it possible to only include the Java runtime in invoker.yml, like this:

  - name: "RUNTIMES_MANIFEST"
    value: '{ "defaultImagePrefix": "openwhisk", "defaultImageTag": "latest", "runtimes": { "java": [ { "kind": "java", "default": true, "image": { "name": "java8action" }, "deprecated": false, "attached": { "attachmentName": "jarfile", "attachmentType": "application/java-archive" }, "sentinelledLogs": false, "requireMain": true } ] }}'
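
For readability, here is the same manifest with the embedded JSON pretty-printed (the content is identical to the snippet above; whether the invoker accepts a manifest containing only the java family is exactly what this issue is asking):

  - name: "RUNTIMES_MANIFEST"
    value: |
      {
        "defaultImagePrefix": "openwhisk",
        "defaultImageTag": "latest",
        "runtimes": {
          "java": [
            {
              "kind": "java",
              "default": true,
              "image": { "name": "java8action" },
              "deprecated": false,
              "attached": {
                "attachmentName": "jarfile",
                "attachmentType": "application/java-archive"
              },
              "sentinelledLogs": false,
              "requireMain": true
            }
          ]
        }
      }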

Unable to create action - connection timed out

Hi,

I have deployed OpenWhisk on a local single-node Kubernetes cluster, following the docs step by step, but I cannot reach the nginx service.

The required pods are running:

root@development:~# kubectl -n openwhisk get pods
NAME                          READY     STATUS    RESTARTS   AGE
apigateway-68f4db64cb-x25lm   1/1       Running   1          5h
controller-0                  1/1       Running   7          5h
controller-1                  1/1       Running   16         5h
couchdb-654f4f74c6-vzk4j      1/1       Running   1          5h
invoker-0                     1/1       Running   1          5h
kafka-746798bbc9-qspxb        1/1       Running   5          5h
nginx-7d9f45fcf4-k5nd6        1/1       Running   1          4h
redis-7c8c49fdd5-9hh67        1/1       Running   1          5h
zookeeper-84b6df89bc-w5k7t    1/1       Running   1          5h

The IP of the Kubernetes nginx service is 10.110.238.5 and the https-api NodePort is 32589:

root@development:~# kubectl -n openwhisk describe service nginx
Name:                     nginx
Namespace:                openwhisk
Labels:                   name=nginx
Annotations:              kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"name":"nginx"},"name":"nginx","namespace":"openwhisk"},"spec":{"ports":[{"n...
Selector:                 name=nginx
Type:                     NodePort
IP:                       10.110.238.5
Port:                     http  80/TCP
TargetPort:               80/TCP
NodePort:                 http  30849/TCP
Endpoints:                10.32.0.14:80
Port:                     https-api  443/TCP
TargetPort:               443/TCP
NodePort:                 https-api  32589/TCP
Endpoints:                10.32.0.14:443
Port:                     https-admin  8443/TCP
TargetPort:               8443/TCP
NodePort:                 https-admin  31981/TCP
Endpoints:                10.32.0.14:8443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

Finally, I set up the wsk CLI:

wsk property set --auth 23bc46b1-71f6-4ed5-8c54-816aa4f8c502:123zO3xZCLrMN6v2BKK1dXYFpXlPkccOFqm12CdAsMgRU4VrNZ9lyGVCGuMDGIwP --apihost https://10.110.238.5:32589

However, when I try to create an action, I get a connection timed out error:

error: Unable to create action 'hello': Put https://10.110.238.5:32589/api/v1/namespaces/_/actions/hello?overwrite=false: dial tcp 10.110.238.5:32589: getsockopt: connection timed out

Is the wsk CLI correctly set up (with the correct IP)?

Thanks.

Invoker was stuck in "ContainerCreating"

I am using Minikube and followed the instructions to deploy.

When I tried to deploy the invoker, invoker-0 was stuck in "ContainerCreating".

The log shows:

{"log":"[2017-11-16T03:34:59.712Z] [ERROR] [??] [Invoker] failed to ping the controller: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:34:59.713844727Z"}
{"log":"[2017-11-16T03:35:59.769Z] [ERROR] [??] [KafkaProducerConnector] sending message on topic 'health' failed: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:35:59.773521203Z"}
{"log":"[2017-11-16T03:35:59.774Z] [ERROR] [??] [Invoker] failed to ping the controller: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:35:59.77544446Z"}
{"log":"[2017-11-16T03:36:59.812Z] [ERROR] [??] [KafkaProducerConnector] sending message on topic 'health' failed: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:36:59.814143252Z"}
{"log":"[2017-11-16T03:36:59.813Z] [ERROR] [??] [Invoker] failed to ping the controller: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:36:59.814279723Z"}
{"log":"[2017-11-16T03:37:59.974Z] [ERROR] [??] [KafkaProducerConnector] sending message on topic 'health' failed: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:38:00.053401421Z"}
{"log":"[2017-11-16T03:38:00.089Z] [ERROR] [??] [Invoker] failed to ping the controller: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:38:00.093904605Z"}
{"log":"[2017-11-16T03:39:00.266Z] [ERROR] [??] [KafkaProducerConnector] sending message on topic 'health' failed: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:39:00.268608706Z"}
{"log":"[2017-11-16T03:39:00.268Z] [ERROR] [??] [Invoker] failed to ping the controller: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:39:00.30064349Z"}
{"log":"[2017-11-16T03:40:00.347Z] [ERROR] [??] [KafkaProducerConnector] sending message on topic 'health' failed: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:40:00.359147873Z"}
{"log":"[2017-11-16T03:40:00.360Z] [ERROR] [??] [Invoker] failed to ping the controller: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:40:00.38143021Z"}
{"log":"[2017-11-16T03:41:00.888Z] [ERROR] [??] [KafkaProducerConnector] sending message on topic 'health' failed: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:41:01.073280485Z"}
{"log":"[2017-11-16T03:41:01.074Z] [ERROR] [??] [Invoker] failed to ping the controller: org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.\n","stream":"stdout","time":"2017-11-16T03:41:01.076197685Z"}

It seems like the invoker is having trouble connecting to Kafka.

Enable pod/node affinity/anti-affinity in the deployment

The following deployment policies can help improve the performance and robustness of the deployed OpenWhisk cluster:

  1. Controller instances, invoker instances, and other services (such as Kafka and CouchDB) should not be deployed on the same K8s node.
  2. For HA purposes, one K8s node should not run more than one controller instance or invoker instance.

FYI, the Helm project has implemented a draft version of the above policies using K8s node affinity and pod affinity/anti-affinity.
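
As a rough sketch (not taken from the Helm draft mentioned above), policy 2 could be expressed with pod anti-affinity on the controller pod template, assuming controller pods carry the label name=controller as the service selector earlier in this page suggests:

      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: name
                    operator: In
                    values: ["controller"]
              topologyKey: kubernetes.io/hostname   # at most one controller pod per node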

/cc @daisy-ycguo & @dgrove-oss

define preStop hook to enable orderly shutdown of Invoker pod

This issue mainly concerns running in a mode where we deploy the Invoker pod via Kubernetes, but still use docker to schedule the individual action containers.

When an invoker exits in an orderly fashion, it attempts to clean up its action containers by invoking removeAllActionContainers from a JVM shutdown hook (see https://github.com/apache/incubator-openwhisk/blob/0cb847c0906f58fee1166938977708d99261c1c5/core/invoker/src/main/scala/whisk/core/containerpool/docker/DockerContainerFactory.scala#L76).

If the invoker crashes (or its pod is terminated by Kubernetes for some other reason), this shutdown hook may not be executed or may be executed in a way that kills running actions. We should use the preStop lifecycle hook to enable a more orderly shutdown of the invoker and its running actions in the case when the invoker is still healthy and it is being descheduled by Kubernetes for some other reason.
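
A minimal sketch of what such a hook could look like in the invoker pod spec (the drain script is a hypothetical placeholder; a real cleanup entry point would have to be added to the invoker image):

    spec:
      terminationGracePeriodSeconds: 60          # give running actions time to complete
      containers:
        - name: invoker
          lifecycle:
            preStop:
              exec:
                # hypothetical cleanup entry point invoked before Kubernetes sends SIGTERM
                command: ["/bin/sh", "-c", "/invoker/drain.sh"]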

Note that if the invoker pod is restarted by Kubernetes on the same worker node, one of its first actions will be to invoke removeAllActionContainers from the init method of DockerContainerFactory https://github.com/apache/incubator-openwhisk/blob/0cb847c0906f58fee1166938977708d99261c1c5/core/invoker/src/main/scala/whisk/core/containerpool/docker/DockerContainerFactory.scala#L73. So the leaked resources from a previously crashed invoker should be reclaimed when the node is re-used for a future instance of the invoker.

document architectural options for Invoker deployment

This issue is to document options for deploying the invoker subsystem for OpenWhisk. The topic has been discussed in various venues before, most recently in a review of #107 by @stigsb.

The key choice to make when deploying invokers is what implementation of the ContainerFactoryProvider SPI to use. There are currently two approaches being used by downstream consumers of this project:

DockerContainerFactory

In this approach, the Kubernetes scheduler is only used to deploy the OpenWhisk "control plane". All of the user action containers are created, managed, and destroyed by the invoker using docker on the Kubernetes worker node. For this approach to work well, it is essential that there is exactly one invoker pod per worker node that is intended for user function execution. Using a DaemonSet for the invokers is a natural fit, since the nodes intended for the invoker to use will be fairly static and can be labeled accordingly. Capacity is added/removed from the system by adding/removing worker nodes to the cluster and/or adding/removing the invoker label on worker nodes.
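
A minimal sketch of that DaemonSet pattern, assuming worker nodes intended to run invokers are labeled openwhisk-role=invoker (the label key and the image/tag are placeholders, not the chart's actual values):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: invoker
  namespace: openwhisk
spec:
  selector:
    matchLabels:
      name: invoker
  template:
    metadata:
      labels:
        name: invoker
    spec:
      nodeSelector:
        openwhisk-role: invoker        # only schedule on labeled worker nodes
      containers:
        - name: invoker
          image: openwhisk/invoker     # placeholder image reference

Capacity would then be added or removed by labeling or unlabeling worker nodes, e.g. kubectl label node <node-name> openwhisk-role=invoker.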

This approach has the advantage of supporting low latency suspend/resume operations, but gives up some of the advantages of running on Kubernetes because it keeps the Kubernetes scheduler in the dark and forces a relatively static allocation of worker nodes to OpenWhisk invokers.

KubernetesContainerFactory

In this approach, the Kubernetes scheduler is used for all container operations: both control plane and user containers are created, managed, and destroyed by Kubernetes. In this approach, it is highly likely that the number of invoker pods will be much smaller than the number of worker nodes in the cluster. Furthermore, it is likely that some form of autoscaling could be applied to dynamically vary the number of invokers to match system load (although #84 is needed to really make autoscaling work well).
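
As a rough illustration only (the chart does not do this today, and it assumes the invokers run as a Deployment that a HorizontalPodAutoscaler can target and that CPU metrics are available), autoscaling might look something like:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: invoker
  namespace: openwhisk
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: invoker          # hypothetical Deployment of invoker pods
  minReplicas: 1
  maxReplicas: 8
  targetCPUUtilizationPercentage: 70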

This approach allows better sharing of compute resources between OpenWhisk and other uses of the Kubernetes cluster. However, the current KubernetesContainer (https://github.com/projectodd/incubator-openwhisk/blob/d2eb77aac212fb9970f3c9f914bf5863dcbefe50/core/invoker/src/main/scala/whisk/core/containerpool/kubernetes/KubernetesContainer.scala#L105 and https://github.com/projectodd/incubator-openwhisk/blob/d2eb77aac212fb9970f3c9f914bf5863dcbefe50/core/invoker/src/main/scala/whisk/core/containerpool/kubernetes/KubernetesContainer.scala#L108) does not actually implement the suspend/resume actions, so cannot be used if suspension of warm containers is a deployment requirement.

Define RUNTIME_MANIFESTS via configmap

We can make it easier to synchronize with the upstream project if we use a ConfigMap for the RUNTIME_MANIFESTS instead of burying it within both controller.yml and invoker.yml.
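
A sketch of how the env var could then be sourced from a ConfigMap in both controller.yml and invoker.yml (the ConfigMap name and key are hypothetical; the env var name matches the RUNTIMES_MANIFEST entry shown in the earlier issue):

  - name: "RUNTIMES_MANIFEST"
    valueFrom:
      configMapKeyRef:
        name: whisk.runtimes      # hypothetical ConfigMap holding the manifest
        key: runtimes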

Error when listing actions

After I set up OpenWhisk in Kubernetes, I got the error message below:

$ wsk action list --insecure
error: Unable to obtain the list of actions for namespace 'default': There was an internal server error. (code 7)
I checked the log of the controller and got this:
[2017-11-28T07:32:48.415Z] [INFO] [#tid_16478] GET /api/v1/namespaces/_/actions limit=30&skip=0
[2017-11-28T07:32:48.415Z] [INFO] [#tid_16478] [RestAPIVersion] authenticate: 23bc46b1-71f6-4ed5-8c54-816aa4f8c502
[2017-11-28T07:32:48.415Z] [INFO] [#tid_16478] [Identity] [GET] serving from cache: CacheKey(23bc46b1-71f6-4ed5-8c54-816aa4f8c502) [marker:database_cacheHit_count:1]
[2017-11-28T07:32:48.415Z] [INFO] [#tid_16478] [RestAPIVersion] authentication valid
[2017-11-28T07:32:48.416Z] [INFO] [#tid_16478] [LocalEntitlementProvider] checking user 'guest' has privilege 'READ' for 'actions/guest'
[2017-11-28T07:32:48.416Z] [INFO] [#tid_16478] [ActivationThrottler] concurrent activations in system = 0, below limit = 5000
[2017-11-28T07:32:48.416Z] [INFO] [#tid_16478] [LocalEntitlementProvider] authorized
[2017-11-28T07:32:48.416Z] [INFO] [#tid_16478] [ActionsApi] [LIST] exclude private entities: required == false
[2017-11-28T07:32:48.417Z] [INFO] [#tid_16478] [CouchDbRestStore] [QUERY] 'test_whisks' searching 'whisks/actions [marker:database_queryView_start:2]
[2017-11-28T07:32:48.420Z] [ERROR] [#tid_16478] [CouchDbRestStore] Unexpected http response code: 404 Not Found [marker:database_queryView_error:6:3]
[2017-11-28T07:32:48.420Z] [ERROR] [#tid_16478] [CouchDbRestStore] [QUERY] 'test_whisks' internal error, failure: 'Unexpected http response code: 404 Not Found' [marker:database_queryView_error:6:3]
[2017-11-28T07:32:48.420Z] [ERROR] [#tid_16478] [ActionsApi] [LIST] entity failed: Unexpected http response code: 404 Not Found
[2017-11-28T07:32:48.421Z] [INFO] [#tid_16478] [BasicHttpService] [marker:http_get.500_count:7:7]

Is there any data missing in CouchDB?

WHISK_API_HOST_NAME not set correctly in invoker.yml

invoker.yml sets WHISK_API_HOST_NAME to nginx.openwhisk (the internal service name for the nginx engine). This is wrong; the only use of this envvar is to set __OW_API_HOST in the user function container's environment -- we need any callbacks from the user containers to go in through the front-door ingress, not short-circuit to an internal route (which will not even be reachable to them from a properly secured deployment).
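
A sketch of the intended setting in invoker.yml, with a placeholder for the externally reachable address (the real value would come from however the deployment exposes its front-door ingress):

  - name: "WHISK_API_HOST_NAME"
    # must be the externally reachable front-door address, not the in-cluster nginx service
    value: "openwhisk.example.com"   # placeholder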

issue about spreading jobs to other nodes

If there are too many jobs to run, how does the system (OpenWhisk on Kubernetes) spread jobs to other nodes?
Does Kubernetes spread OpenWhisk components or jobs to other nodes?

If a new job comes in, OpenWhisk starts a Docker container to run it, right? How does OpenWhisk allocate and free resources? Are resources managed by OpenWhisk or by Kubernetes? When does Kubernetes spread OpenWhisk components or jobs to other nodes? How is load balancing done?

Can anyone help me?

Fix runc issues to enable container reuse

I'm seeing similar errors as reported in apache/openwhisk#2788, with docker-runc getting a Go deserialization error attempting to read the state.json for a container when executing in Kubernetes.

[2017-09-29T20:15:54.551Z] [ERROR] [#sid_102] [RuncClient] code: 1, stdout: , stderr: json: cannot unmarshal object into Go value of type []string [marker:invoker_runc.pause_error:6830148:259]

I think the right fix for Kubernetes might be to set up a sidecar container with the appropriate Docker version so we keep all of the action containers contained within the invoker pod. See https://applatix.com/case-docker-docker-kubernetes-part-2/ for some discussion. This would give better confinement of the containers to the pod. An alternative fix would be to figure out how to mount the runc from the worker node.
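
For the second alternative, a rough, unvalidated sketch of mounting the worker node's runc binary into the invoker pod (the /usr/bin/docker-runc paths are assumptions about both the node and the invoker image):

    spec:
      containers:
        - name: invoker
          volumeMounts:
            - name: docker-runc
              mountPath: /usr/bin/docker-runc   # assumed path expected inside the invoker image
      volumes:
        - name: docker-runc
          hostPath:
            path: /usr/bin/docker-runc          # assumed location on the worker node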

Attaching a shell to the invoker pod, I can reproduce the docker-runc pause error manually. I can also see that docker pause/unpause work as expected if I use those commands.

In any case, we are currently not getting container reuse on Kubernetes via runc with the default InvokerReactive, so performance is not what we would expect.
