projectriff / fats
FaaS Acceptance Test Suite
There's a lot of useful logic in the uppercase/run.sh script that should also be used for other functions. It should be abstracted out to support other functions without a lot of duplication.
The Minikube registry is painfully slow. We may be able to improve performance by running the container registry directly in the Docker daemon instead of inside k8s.
...removed since it is consistently failing and blocking other tests
The .travis.fats-trigger.sh script stopped triggering fats. See https://travis-ci.org/projectriff/riff/jobs/449162152
Travis CI now has Windows available as a runtime. It would be nice to test the Windows CLI binary.
cc @fbiville
We run FATS on GKE via TravisCI, but it should also be easy for someone to configure a local cluster and run the core of the test suite against that cluster. Decoupling functions/run.sh will make it easier to smoke test various Kubernetes distributions, even if we don't have them run automatically (Minikube on Travis is a bit rough). We could also decide to run on other managed k8s environments via Travis.
A new release was published 3 days ago: https://github.com/kubernetes/minikube/releases/tag/v0.29.0
It includes the removal of localkube: kubernetes/minikube#2911.
This currently blocks FATS on minikube: https://travis-ci.org/projectriff/fats/jobs/435507850#L508.
Currently we share a single PKS cluster for all jobs. Because riff uses a number of cluster scoped resources, we cannot run multiple jobs against a single cluster concurrently. Moreover, there is a risk of leaking state between runs causing false test results.
We can create a new cluster for each run so long as we are willing to wait for the cluster to be provisioned (about an hour right now 🤢) and set up a load balancer to target the master node.
Note: Travis builds will time out after 10 minutes of inactivity. We can prefix long-running tasks with travis_wait 70, where 70 is the number of minutes to wait.
217.0.0-0 was a bag of hurt, so we're currently pinned to 216.0.0-0. At some point we should get back in sync with the latest.
The "user" portion of the registry prefix is a holdover from when riff namespace init had a --registry-user flag. This was never part of the registry API and has been correctly removed from riff. We should also remove it from FATS.
We need to add support and tests for applications.
FATS is a pile of bash scripts. Bash is useful for quick and dirty scripts, but it does not scale well as more people need to interact with, consume, and author the system, especially as bash is not a common language across the team and no one is particularly strong in it (or admits to being strong).
There are a few key aspects of FATS that I'd like to preserve:
Some things we can do better:
Any k8s runtime should be able to use any image registry. We should be able to mix and match the registry with the k8s runtime more easily.
Right now we assume:
This cluster type was removed in this PR: #147
The Knative serving webhook was registered but not running, which caused resource creation to fail. We should wait for all deployments to be ready before proceeding.
$ kubectl get deployments --all-namespaces
NAMESPACE         NAME                        DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
istio-system      istio-citadel               1         1         1            1           1m
istio-system      istio-egressgateway         1         1         1            1           1m
istio-system      istio-galley                1         1         1            1           1m
istio-system      istio-ingressgateway        1         1         1            1           1m
istio-system      istio-pilot                 1         1         1            1           1m
istio-system      istio-policy                1         1         1            1           1m
istio-system      istio-sidecar-injector      1         1         1            1           1m
istio-system      istio-statsd-prom-bridge    1         1         1            1           1m
istio-system      istio-telemetry             1         1         1            1           1m
istio-system      knative-ingressgateway      1         1         1            1           13s
knative-build     build-controller            1         1         1            1           14s
knative-build     build-webhook               1         1         1            1           14s
knative-eventing  eventing-controller         1         1         1            1           12s
knative-eventing  stub-clusterbus-dispatcher  1         1         1            1           6s
knative-eventing  webhook                     1         1         1            0           12s
knative-serving   activator                   1         1         1            1           13s
knative-serving   autoscaler                  1         1         1            0           13s
knative-serving   controller                  1         1         1            1           13s
knative-serving   webhook                     1         1         1            0           13s
kube-system       event-exporter-v0.2.1       1         1         1            1           3m
kube-system       fluentd-gcp-scaler          1         1         1            1           3m
kube-system       heapster-v1.5.3             1         1         1            1           3m
kube-system       kube-dns                    2         2         2            2           3m
kube-system       kube-dns-autoscaler         1         1         1            1           3m
kube-system       l7-default-backend          1         1         1            1           3m
kube-system       metrics-server-v0.2.1       1         1         1            1           3m
$ ./run.sh
Current function scenario: uppercase
~/gopath/src/github.com/projectriff/fats/functions/uppercase/command ~/gopath/src/github.com/projectriff/fats
[2018-10-04T21:33:27Z] Creating fats-uppercase-command as command:
Error: Internal error occurred: failed calling admission webhook "webhook.serving.knative.dev": Post https://webhook.knative-serving.svc:443/?timeout=30s: no endpoints available for service "webhook"
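The listing above shows three deployments with AVAILABLE at 0 when run.sh started. A minimal sketch of a readiness gate follows. It assumes the column layout printed above (DESIRED in column 3, AVAILABLE in column 6), which newer kubectl versions may not use. The function names unready_deployments and wait_for_deployments are hypothetical and not part of FATS.

```bash
# Print "namespace/name" for every deployment whose AVAILABLE count is below
# DESIRED. Reads `kubectl get deployments --all-namespaces` output on stdin and
# assumes the column layout shown above (DESIRED = column 3, AVAILABLE = column 6).
unready_deployments() {
  awk 'NR > 1 && $6 < $3 { print $1 "/" $2 }'
}

# Poll until every deployment is available, or give up after a timeout.
# Arguments (both optional): timeout in seconds, polling interval in seconds.
wait_for_deployments() {
  local timeout=${1:-300} interval=${2:-5} waited=0
  local listing pending
  while true; do
    # Fail fast if the cluster cannot be queried at all.
    listing=$(kubectl get deployments --all-namespaces) || return 1
    pending=$(unready_deployments <<<"$listing")
    [ -z "$pending" ] && return 0
    if ((waited >= timeout)); then
      echo "deployments still not ready after ${timeout}s:" >&2
      echo "$pending" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

Calling wait_for_deployments 300 after the riff install and before the first function is created would have held the run until the webhook endpoints existed, or failed with the list of deployments that never came up.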
Run eval $(minikube docker-env) to ensure that the proper env variables are set.
We limited the job concurrency to 1 because we are sharing a single PKS cluster for builds. Once we are creating a PKS cluster per job, we can remove the concurrency limits.
Needs #52
After deleting a GKE cluster, there are a number of orphaned resources including health checks and persistent volumes. We either need to figure out how to get these resources to be cleaned up automatically or purge them after the cluster is destroyed.
Right now we invoke each function once, which didn't detect an issue with the node invoker where the second request failed. We should add a few additional invocations.
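One way to add the extra invocations is a small wrapper that runs the same command several times and stops at the first failure. This is only a sketch: invoke_repeatedly is a hypothetical helper, and it assumes each invocation is a command whose exit status reflects whether the response was correct.

```bash
# Run a command a fixed number of times, failing on the first non-zero exit.
# Usage: invoke_repeatedly COUNT COMMAND [ARGS...]
invoke_repeatedly() {
  local count=$1
  shift
  local i
  for ((i = 1; i <= count; i++)); do
    if ! "$@"; then
      echo "invocation $i of $count failed: $*" >&2
      return 1
    fi
  done
}
```

Because the failing attempt number is reported, a problem like the node invoker's second-request failure would show up as "invocation 2 of 3 failed".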
FATS tests assume they are running in a clean CI machine. They make liberal use of sudo and system directories. While shared, external resources are cleaned up, there is minimal attempt made to clean up local resources, or to create resources in a way that won't impact the broader environment.
Specific, actionable issues should be created and worked based on the pains faced by riff developers testing riff with FATS.
FATS coverage is required to release streaming runtime with riff v0.5.0
Feb 6: bumped to v0.6.0.
This combination was disabled in this PR: #147
This script restarts the docker daemon, which is problematic if the cluster is already up and running.
This also applies for registries/docker-daemon/configure.sh
Tests will fail until updated
Depends on projectriff/node-function-buildpack#161
Pivotal provides the Gitbot service to synchronize issues and pull requests made against public GitHub repos with Pivotal Tracker projects.
If you are a Pivotal employee, you can configure Gitbot to sync your GitHub repo to your Pivotal Tracker project with a pull request.
Steps:
If you are not a Pivotal employee, you can request that [email protected] set up the integration for you.
You might also be interested in configuring GitHub's Service Hook for Tracker on your repo so you can link your commits to Tracker stories. You can do this yourself by following the directions at:
https://www.pivotaltracker.com/blog/guide-githubs-service-hook-tracker/
If you do not want to use Pivotal Tracker to manage this GitHub repo, please add this repo to the Ignored repositories list.
If there are any questions, please reach out to [email protected].
This OS was disabled in this PR: #147
We should run on:
Istio fails to start when using Minikube with DockerHub as a registry. Strangely, Minikube works when using the Minikube registry add-on.
The failure occurs while waiting for Istio to start during the riff install.
See https://travis-ci.org/projectriff/fats/jobs/467600214#L550
A couple of times now we've been hit by STOCKOUT errors when trying to create a GKE cluster. Currently FATS is hard-coded to us-central1-a. We should minimally pick a zone at random for the region and probably also pick a region dynamically as well. This will make retries much more likely to succeed. In the future, we could detect the STOCKOUT error and try a new zone within the same job.
The STOCKOUT will also impact PKS on GCP, however, PKS chooses where to place the new cluster rather than the client...
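Picking a zone at random could be sketched as follows. The helper name pick_random and the zone list in the usage comment are illustrative assumptions, not what FATS currently does. The list would need to match the zones actually available to the project.

```bash
# Print one of the given arguments, chosen at random using bash's $RANDOM.
# Usage: pick_random CHOICE [CHOICE...]
pick_random() {
  if (($# == 0)); then
    echo "pick_random: no choices given" >&2
    return 1
  fi
  local -a choices=("$@")
  echo "${choices[RANDOM % $#]}"
}

# Example usage (assumed zone list for the us-central1 region):
#   zone=$(pick_random us-central1-a us-central1-b us-central1-c us-central1-f)
#   gcloud container clusters create "$cluster_name" --zone "$zone"
```

Since each retry draws again, a job that hits a STOCKOUT in one zone has a reasonable chance of landing in a different zone on the next attempt.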
Stalled jobs can orphan a GKE cluster. Retriggering the job will fail when it attempts to create a new cluster, but the name already exists. Before we create a new GKE cluster we should check if there is an existing cluster and if so, delete it.
This behavior only affects Travis because it reuses build ids when retriggering.
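A rough sketch of the pre-create check, assuming the gcloud CLI is installed and authenticated. The function name delete_cluster_if_exists is hypothetical. It treats a successful describe as proof that a cluster with that name exists in the zone.

```bash
# Delete a GKE cluster left behind by a stalled job, if one exists.
# Usage: delete_cluster_if_exists CLUSTER_NAME ZONE
delete_cluster_if_exists() {
  local name=$1 zone=$2
  # `describe` exits non-zero when no cluster with this name exists in the zone.
  if gcloud container clusters describe "$name" --zone "$zone" >/dev/null 2>&1; then
    echo "deleting existing cluster $name in $zone" >&2
    gcloud container clusters delete "$name" --zone "$zone" --quiet
  fi
}
```

Note that this only finds a leftover cluster in the zone it is given. If the zone is chosen at random per run, the orphan may live in a different zone than the retriggered job checks.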
We're now using Azure pipelines for our builds instead of Travis, but we still use a number of Travis idioms like travis_retry, travis_fold and some TRAVIS_* environment variables.
We should convert retry and fold into generic bash functions and eliminate assumptions about TRAVIS env vars existing. We should be careful not to couple to any Azure specific functionality as we may still want to run builds on Travis (or another environment) in the future. For example, the fold function should continue to work as-is on Travis, but not pollute the log output when running on AZP.
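The generic replacements might look something like the sketch below. The names retry and fold and their argument order are assumptions. The fold markers follow the travis_fold:start:NAME / travis_fold:end:NAME convention and are only emitted when the TRAVIS environment variable is set to true, so other environments see plain command output.

```bash
# Retry a command up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
  local attempts=$1 delay=$2
  shift 2
  local n
  for ((n = 1; n <= attempts; n++)); do
    "$@" && return 0
    if ((n < attempts)); then
      echo "attempt $n of $attempts failed, retrying in ${delay}s: $*" >&2
      sleep "$delay"
    fi
  done
  return 1
}

# Run a command inside a named log fold on Travis; run it unwrapped elsewhere.
# Usage: fold NAME COMMAND [ARGS...]
fold() {
  local name=$1
  shift
  if [ "${TRAVIS:-}" = "true" ]; then
    local status=0
    echo "travis_fold:start:${name}"
    "$@" || status=$?
    echo "travis_fold:end:${name}"
    return "$status"
  fi
  "$@"
}
```

Keeping the environment check inside fold means calling scripts stay identical across CI systems, and support for another system's grouping syntax could be added in one place later.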
The java and node invokers currently support streaming functions in addition to request-reply. FATS should test these flavors of functions.
Currently, FATS is running once per day via cron. It would be nice to detect issues quicker by running FATS after every successful master branch build of a dependency repo. We can use the Travis API to trigger builds from each repo.
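As a sketch of the trigger, the function below posts a build request for a repo through the Travis API v3 requests endpoint. The function name trigger_travis_build and the TRAVIS_API_TOKEN variable are assumptions for illustration. The token would need to be provided as a secure variable in each dependency repo.

```bash
# Ask Travis to start a build of a repo's branch via the API v3 requests endpoint.
# Usage: trigger_travis_build OWNER/REPO [BRANCH]
# Assumes TRAVIS_API_TOKEN holds a token with access to the target repo.
trigger_travis_build() {
  local slug=$1 branch=${2:-master}
  # The API expects the slash in the repo slug to be URL-encoded.
  local encoded=${slug//\//%2F}
  curl --silent --show-error --fail \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Accept: application/json" \
    --header "Travis-API-Version: 3" \
    --header "Authorization: token ${TRAVIS_API_TOKEN}" \
    --data "{\"request\": {\"branch\": \"${branch}\"}}" \
    "https://api.travis-ci.org/repo/${encoded}/requests"
}
```

A dependency repo could call trigger_travis_build projectriff/fats at the end of a successful master build.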