knative / serving
Kubernetes-based, scale-to-zero, request-driven compute
Home Page: https://knative.dev/docs/serving/
License: Apache License 2.0
Modify the webhook to watch for all the types and have it call the specific validation / mutation for each specific resource.
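One way to picture the per-resource dispatch is a small registry keyed by kind. This is only a sketch under assumptions: the Object and Handler types, the admit function, and the "image" and "serviceType" fields are hypothetical and do not come from the actual webhook code.

```go
package main

import (
	"errors"
	"fmt"
)

// Object is a hypothetical, untyped view of an incoming resource.
type Object map[string]interface{}

// Handler bundles the optional mutation and validation hooks for one kind.
type Handler struct {
	Mutate   func(obj Object)
	Validate func(obj Object) error
}

// handlers maps a resource kind to its hooks. The hooks and the fields
// they inspect are illustrative assumptions only.
var handlers = map[string]Handler{
	"Revision": {
		Validate: func(obj Object) error {
			if obj["image"] == nil {
				return errors.New("revision: image is required")
			}
			return nil
		},
	},
	"ElaService": {
		Mutate: func(obj Object) {
			if obj["serviceType"] == nil {
				obj["serviceType"] = "default"
			}
		},
	},
}

// admit looks up the handler for the kind, applies the mutation first,
// and then runs the validation. Unknown kinds are rejected.
func admit(kind string, obj Object) error {
	h, ok := handlers[kind]
	if !ok {
		return fmt.Errorf("no handler registered for kind %q", kind)
	}
	if h.Mutate != nil {
		h.Mutate(obj)
	}
	if h.Validate != nil {
		return h.Validate(obj)
	}
	return nil
}

func main() {
	fmt.Println(admit("Revision", Object{}))
	fmt.Println(admit("ElaService", Object{}))
}
```

Adding a new resource type then only means registering another entry in the map.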
I believe this should be Google (at least for now).
Stuff like:
Here are the k8s guidelines: https://github.com/kubernetes/community/blob/master/contributors/devel/pull-requests.md#the-testing-and-merge-workflow
We should set the status of a Revision to something meaningful. In case of failures it should be obvious what happened.
I think that until we have some sort of GC based on reachability from ElaService in place, we should be setting the OwnerReference on Revisions to be the RevisionTemplate.
It is really annoying to have to clean these up all the time :)
I'm open to discussing leaving these around as a "feature" once it's not such a resource leak.
@vaikas-google @evankanderson WDYT? I think this'd be a good starter task, if you guys agree?
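A rough sketch of what setting the owner could look like follows. To stay free of external dependencies it declares local, simplified mirrors of the Kubernetes metadata types; OwnerReference, ObjectMeta, RevisionTemplate, Revision, and newRevision here are stand-ins for illustration, not the real definitions.

```go
package main

import "fmt"

// OwnerReference is a local, simplified mirror of the Kubernetes
// metav1.OwnerReference type, declared here so the sketch needs nothing
// outside the standard library.
type OwnerReference struct {
	APIVersion string
	Kind       string
	Name       string
	UID        string
	Controller *bool
}

// ObjectMeta holds only the metadata fields this sketch needs.
type ObjectMeta struct {
	Name            string
	UID             string
	OwnerReferences []OwnerReference
}

// RevisionTemplate and Revision are hypothetical stand-ins for the resources.
type RevisionTemplate struct{ ObjectMeta }
type Revision struct{ ObjectMeta }

// newRevision builds a Revision whose owner is the given RevisionTemplate,
// so that Kubernetes garbage collection can remove the Revision when the
// RevisionTemplate is deleted.
func newRevision(name string, rt *RevisionTemplate) *Revision {
	isController := true
	return &Revision{ObjectMeta: ObjectMeta{
		Name: name,
		OwnerReferences: []OwnerReference{{
			APIVersion: "elafros.dev/v1alpha1",
			Kind:       "RevisionTemplate",
			Name:       rt.Name,
			UID:        rt.UID,
			Controller: &isController,
		}},
	}}
}

func main() {
	rt := &RevisionTemplate{ObjectMeta: ObjectMeta{Name: "revisiontemplate-example", UID: "example-uid"}}
	rev := newRevision("rev-1", rt)
	owner := rev.OwnerReferences[0]
	fmt.Printf("%s is owned by %s/%s\n", rev.Name, owner.Kind, owner.Name)
}
```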
These default readiness checks keep Steren's sample-app from coming up.
We need to validate the objects coming in. For an ElaService, for example, we should make sure that the route percentages add up to 100%, and so forth.
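As a minimal sketch of the percentage check, assuming a simplified TrafficTarget type (a hypothetical stand-in for one entry of the ElaService traffic list) and a hypothetical validateTraffic helper:

```go
package main

import (
	"errors"
	"fmt"
)

// TrafficTarget is a hypothetical, simplified stand-in for one entry of an
// ElaService's rollout traffic list.
type TrafficTarget struct {
	RevisionTemplate string
	Percent          int
}

// validateTraffic checks that every percentage lies within 0..100 and that
// the percentages add up to exactly 100.
func validateTraffic(targets []TrafficTarget) error {
	if len(targets) == 0 {
		return errors.New("traffic must list at least one target")
	}
	sum := 0
	for i, t := range targets {
		if t.Percent < 0 || t.Percent > 100 {
			return fmt.Errorf("traffic[%d]: percent %d is outside 0..100", i, t.Percent)
		}
		sum += t.Percent
	}
	if sum != 100 {
		return fmt.Errorf("traffic percentages sum to %d, want 100", sum)
	}
	return nil
}

func main() {
	even := []TrafficTarget{
		{RevisionTemplate: "a", Percent: 50},
		{RevisionTemplate: "b", Percent: 50},
	}
	over := []TrafficTarget{
		{RevisionTemplate: "a", Percent: 60},
		{RevisionTemplate: "b", Percent: 50},
	}
	fmt.Println(validateTraffic(even))
	fmt.Println(validateTraffic(over))
}
```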
Our current helloworld shows 100% of the traffic hitting a revision from a revisiontemplate.
It would be easy to show:
Push a new version of the template; a new revision is created and traffic is shifted automatically
(just modify revisiontemplate.yaml with a different TARGET and run bazel ... .apply)
Traffic will shift to the new revision (look at elaservice status) and curling
Do manual splitting by configuring the service to push 50/50 between the revisions above
others...
We should document how to set up a custom domain and path-based URL dispatch using a reverse proxy like nginx.
We probably want to be able to direct logs to the customer's Prometheus instance. A few questions:
Feel free to contact me (argent@) for details.
A good example of this is the setCondition (and removeCondition) method I'm adding in the Build PR, which simply manipulates the datatype in a particular way.
We should triage what else we have that does this and move things there for the greater good of all mankind.
Tentatively assigning to myself as I'd like to triage this for the Build repo as well.
@imjasonh FYI
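For readers unfamiliar with this kind of helper, here is a sketch of what setCondition and removeCondition style methods might look like. The Condition and Status types below are hypothetical and are not taken from the Build PR.

```go
package main

import "fmt"

// Condition is a hypothetical, simplified status condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// Status is a hypothetical status block that carries a list of conditions.
type Status struct {
	Conditions []Condition
}

// setCondition replaces the existing condition of the same Type, or
// appends the condition if none of that Type exists yet.
func (s *Status) setCondition(c Condition) {
	for i := range s.Conditions {
		if s.Conditions[i].Type == c.Type {
			s.Conditions[i] = c
			return
		}
	}
	s.Conditions = append(s.Conditions, c)
}

// removeCondition drops every condition of the given type, filtering the
// slice in place.
func (s *Status) removeCondition(t string) {
	kept := s.Conditions[:0]
	for _, c := range s.Conditions {
		if c.Type != t {
			kept = append(kept, c)
		}
	}
	s.Conditions = kept
}

func main() {
	var s Status
	s.setCondition(Condition{Type: "Ready", Status: "False", Reason: "Deploying"})
	s.setCondition(Condition{Type: "Ready", Status: "True"})
	fmt.Println(len(s.Conditions), s.Conditions[0].Status)
	s.removeCondition("Ready")
	fmt.Println(len(s.Conditions))
}
```

Keeping these as methods on the shared type means every controller manipulates conditions the same way.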
Given the ingress service public IP, is this just an A record created by the user in their DNS?
Creating istio resources on a minikube cluster currently fails with dozens of errors like this:
Error from server (Forbidden): error when retrieving current configuration of:
&{0xc4228d86c0 0xc4203ecc40 istio-system STDIN 0xc4228ce3c0 false}
from server for: "STDIN": namespaces "istio-system" is forbidden: User "system:anonymous" cannot get namespaces in the namespace "istio-system": Unknown user "system:anonymous"
I've tried all the following configurations, and all have failed with the same errors on bazel run :everything.apply:
minikube start --kubernetes-version v1.8.0 --vm-driver=kvm2
minikube start --kubernetes-version v1.9.0 --vm-driver=kvm2
minikube start --kubernetes-version v1.9.0 --vm-driver kvm2 --extra-config=apiserver.Authorization.Mode=RBAC
minikube start --kubernetes-version v1.9.0 --vm-driver kvm2 --feature-gates=AllAlpha=true
I verified my user is an admin with this command:
kubectl auth can-i create clusterroles
#=> yes
I attempted to give the system:anonymous user cluster-admin privileges like this, but that didn't change the errors:
kubectl create clusterrolebinding anon-cluster-admin --clusterrole=cluster-admin --user=system:anonymous
#=> clusterrolebinding "anon-cluster-admin" created
# In case it's a serviceaccount (though no such service account exists)
kubectl create clusterrolebinding anon-cluster-admin --clusterrole=cluster-admin --serviceaccount=system:anonymous
#=> clusterrolebinding "anon-cluster-admin" created
On the 1.8 clusters, it seems RBAC is not enabled:
kubectl get clusterroles
#=> No resources found.
But it is enabled on the 1.9 clusters (get clusterroles returns a non-empty list).
Is anyone else able to run bazel run :everything.apply on minikube successfully?
Wire this into the placeholder k8s service that can then do 0->1 scaling. Find a way to do this so that there is only one per cluster.
We weren't using symbolic enums for things like .status.conditions[*].type or .status.conditions[*].status because of limitations in the apiserver-builder (IIUC).
We should use symbolic enums for these (for the latter: corev1.ConditionStatus).
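A sketch of what symbolic enums for these fields could look like. ConditionStatus below is a local mirror of corev1.ConditionStatus so the example stays self-contained; RevisionConditionType, its constants, and isReady are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// ConditionStatus is a local mirror of corev1.ConditionStatus, declared
// here only to keep the sketch free of external dependencies.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

// RevisionConditionType is a hypothetical symbolic enum for
// .status.conditions[*].type.
type RevisionConditionType string

const (
	RevisionConditionReady  RevisionConditionType = "Ready"
	RevisionConditionFailed RevisionConditionType = "Failed"
)

// RevisionCondition uses the typed enums instead of bare strings, so a
// mistyped value is caught by the compiler rather than at runtime.
type RevisionCondition struct {
	Type   RevisionConditionType
	Status ConditionStatus
}

// isReady reports whether a Ready condition with status True is present.
func isReady(conds []RevisionCondition) bool {
	for _, c := range conds {
		if c.Type == RevisionConditionReady && c.Status == ConditionTrue {
			return true
		}
	}
	return false
}

func main() {
	conds := []RevisionCondition{{Type: RevisionConditionReady, Status: ConditionTrue}}
	fmt.Println(isReady(conds))
}
```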
We probably want to be able to direct logs to the customer's ELK instance. A few questions:
status.logUrl)
Feel free to contact me (argent@) for details.
We should support passing imagePullSecrets to Revisions, so that they can pull from a private registry without GKE's built-in implicit authentication for GCR.
at least on master
To help streamline end-user testing we should create and host release binaries and container images for the Elafros control plane components.
At minimum, need aggregation data across K8s, Istio, and Elafros control plane
A couple of options to start with
Mainly clarity of triggers (2 reviewers? can author merge if they have the rights?)
CODEOWNERS is a github magic file (docs). PRs that touch owned paths automatically request review from owners (which can be users or teams). We can optionally require approval from a code owner to merge.
We need to add validation for the resource.
We currently have a field in the Revision.Spec that specifies if a Revision is active or not. It has not been re-implemented in this version. We should look at using that as a way to control whether k8s resources are created or not. That would make it cleaner for something like the ElaService to flip the revision to an active or inactive state (0->1 and 1->0).
This work includes adding an enum to indicate the desired serving state and implementing the controller logic that makes that state happen.
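A minimal sketch of such an enum and the decision the controller would make from it. The names ServingState, Action, and reconcile, as well as the two state values, are assumptions for illustration; the real controller would create or delete the underlying k8s resources rather than return an action.

```go
package main

import "fmt"

// ServingState is a hypothetical enum for the desired serving state of a
// Revision.
type ServingState string

const (
	ServingStateActive   ServingState = "Active"
	ServingStateInactive ServingState = "Inactive"
)

// Action describes what the controller should do to the k8s resources
// backing a Revision.
type Action string

const (
	ActionNone   Action = "none"
	ActionCreate Action = "create"
	ActionDelete Action = "delete"
)

// reconcile compares the desired state with whether the resources
// currently exist and returns the action that closes the gap.
func reconcile(desired ServingState, resourcesExist bool) (Action, error) {
	switch desired {
	case ServingStateActive:
		if !resourcesExist {
			return ActionCreate, nil // 0->1
		}
		return ActionNone, nil
	case ServingStateInactive:
		if resourcesExist {
			return ActionDelete, nil // 1->0
		}
		return ActionNone, nil
	default:
		return ActionNone, fmt.Errorf("unknown serving state %q", desired)
	}
}

func main() {
	a, err := reconcile(ServingStateActive, false)
	fmt.Println(a, err)
	a, err = reconcile(ServingStateInactive, true)
	fmt.Println(a, err)
}
```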
When a RevisionTemplate is deleted, it should also cause all the associated Revisions to be deleted.
For example, when I run bazel run sample/helloworld:everything.delete the elaservice and revisiontemplate go away. But the revisions do not. And they must be cleaned up manually with kubectl delete revisions --all.
Anecdotal evidence: Many people (PM, Eng, UX) introduced to Archetype concept were confused at first.
We had ~15 people sitting in the UI design sprint and general sentiment was “What’s going on with this Archetype thing? Why does it have to be so weird? Why is it a thing at all?”. There is a desire to aim at hiding the concept of Archetype from new users in the UI and CLI -- this is not a good sign for intuitiveness of the concept. People use alternative names in the discussions -- also not a good sign for intuitiveness of the concept and of the name.
Per UX review ( @steren @qelo ):
The API allows an N:N relation between Services and {Archetypes, Revisions}. This can be confusing to users and can be difficult to represent in user interfaces. Broken down:
Why is this a problem for the UI? The desire in the UI is to start with the main Services view and make things accessible from there, e.g. inside a service you see its revisions etc. However, this hierarchy is a lie compared to API and this creates incompatibilities and issues, e.g.:
Let’s say I have an Archetype without a Service (created via API or whatever) -- where in the UI could I see it? We would need a new “Archetypes” view, separate from our main “Services” view, which could be distracting to customers.
Let’s say I want to show details of a particular Revision. Assume the user got there from service X. I’d like to show the Revision in the context of X, and just say that the Revision is serving 50% for X, or is accessible at this URL etc. However, to fully show Revision details, the UI would need to determine that the Revision is also serving for unrelated service Y, or is routable with a different URL for service Z. This is magnified if you can reach the Revision from other paths, such as Archetype view or a Revision list view.
When a customer drills down to a Service, they want to be able to see a list of Revisions related to this service. If N:N relationships are present (and it can be difficult to determine if that is the case without a full list of resources in the namespace), it is difficult to suggest a query to find the relevant Revisions.
We expect that the UI displays a single list of revisions (probably sorted by creation date). If these Revisions have been created by different Archetypes, then displaying a single list might not map to the mental model of our users (they might consider them to in fact be multiple independent histories). Displaying multiple revision lists for a given Service complicates the UI.
and create a cheap and cheerful autoscaler that uses qps/latency/whatever heuristics to rescale a HelioRevision's pods
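One very simple heuristic of this kind scales on observed QPS against a per-pod target. The sketch below is an assumption about what "cheap and cheerful" could mean, not the project's autoscaler; desiredPods and all of its parameters are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// desiredPods returns how many pods are needed to serve observedQPS when
// each pod is expected to handle targetQPSPerPod, clamped to the range
// [minPods, maxPods].
func desiredPods(observedQPS, targetQPSPerPod float64, minPods, maxPods int) int {
	if targetQPSPerPod <= 0 {
		// Without a usable target, fall back to the minimum.
		return minPods
	}
	n := int(math.Ceil(observedQPS / targetQPSPerPod))
	if n < minPods {
		return minPods
	}
	if n > maxPods {
		return maxPods
	}
	return n
}

func main() {
	// 95 QPS at 10 QPS per pod rounds up to 10 pods.
	fmt.Println(desiredPods(95, 10, 0, 20))
	// No traffic with minPods of 0 scales to zero.
	fmt.Println(desiredPods(0, 10, 0, 20))
}
```

A latency-based variant would follow the same shape, with the ratio of observed to target latency taking the place of the QPS ratio.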
The Build repo (once vendored) has methods for validating Build and BuildTemplate resources.
We should use these in our validation logic to ensure we're getting a valid build specification.
Inside the config we use ela- with the hyphen, so rename the files accordingly.
On a relatively new cluster (less than an hour old), I ran the demo with bazel run sample/demo/helloworld:everything.apply. Everything appeared to work, except the ingress did not appear in kubectl get ingress immediately. It took at least 15 seconds to show in that list (I checked a minute later and it was there). Statuses of the RevisionTemplate and Revision were as expected. Logs show that the ingress was created immediately:
I0129 19:00:36.153595 1 controller.go:278] Running reconcile ElaService for elaservice-example
&{TypeMeta:{Kind: APIVersion:} ObjectMeta:{Name:elaservice-example GenerateName: Namespace:default SelfLink:/apis/elafros.dev/v1alpha1/namespaces/default/elaservices/elaservice-example UID:b06c282d-0526-11e8-bfae-42010af00290 ResourceVersion:1228 Generation:0 CreationTimestamp:2018-01-29 19:00:36 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[] Annotations:map[kubectl.kubernetes.io/last-applied-configuration:{"apiVersion":"elafros.dev/v1alpha1","kind":"ElaService","metadata":{"annotations":{},"name":"elaservice-example","namespace":"default"},"spec":{"domainSuffix":"demo.googlecustomer.net","rollout":{"traffic":[{"percent":100,"revisionTemplate":"revisiontemplate-example"}]}}}
] OwnerReferences:[] Initializers:nil Finalizers:[] ClusterName:} Spec:{DomainSuffix:demo.googlecustomer.net ServiceType: Rollout:{Traffic:[{Name: Revision: RevisionTemplate:revisiontemplate-example Percent:100}]} Current: Next: RolloutPercentToNext:0 ForceReconcile:} Status:{Current: RolloutPercentage: Next: Conditions:[]}}
2018/01/29 19:00:36 Creating/Updating placeholder k8s services
2018/01/29 19:00:36 Created service: "elaservice-example-service"
2018/01/29 19:00:36 Creating or updating ingress rule
2018/01/29 19:00:36 Created ingress "elaservice-example-ela-ingress"
2018/01/29 19:00:36 Creating istio route rules
...
Once the ingress was created, there were no events attached to it:
Name: elaservice-example-ela-ingress
Namespace: default
Address: 35.227.54.4
Default backend: default-http-backend:80 (10.28.0.8:8080)
Rules:
Host Path Backends
---- ---- --------
demo.googlecustomer.net
elaservice-example-service:http (<none>)
Annotations:
Events: <none>
During the period of invisibility, I only tried listing all ingresses; I didn't try to get the ingress by name.
Nobody else has seen this behavior. Even if GKE was experiencing delays attaching IPs, we'd expect the resource to appear in the list immediately.