
Comments (11)

rosmo commented on May 24, 2024

Also you might consider the Gateway resource as well: https://cloud.google.com/kubernetes-engine/docs/how-to/deploying-gateways
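
For readers unfamiliar with the Gateway resource, a minimal Gateway plus HTTPRoute pair might look like the sketch below. All names and the `gatewayClassName` are illustrative (not taken from this thread); see the linked docs for the classes supported on your cluster:

```yaml
# Hypothetical sketch: a managed external Gateway routing HTTPS
# traffic to a Service named frontend-svc. Names are illustrative.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: frontend-gateway
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - name: frontend-cert
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: frontend-route
spec:
  parentRefs:
  - name: frontend-gateway
  rules:
  - backendRefs:
    - name: frontend-svc
      port: 443
```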

from gke-autoneg-controller.

natemurthy commented on May 24, 2024

Pinging a few top contributors to get some 👀 on this @rosmo @fdfzcq

rosmo commented on May 24, 2024

A few questions:

  1. How did you deploy Autoneg?
  2. Can you check the autoneg-controller-manager logs for context around the error?
  3. Do you have Workload Identity enabled on your cluster, with the proper mapping between the KSA and the GCP service account (i.e. the annotation on the KSA and the IAM binding on the GSA)?
  4. Does a backend service called https-be actually exist, and does the rest of the config point at real resources?
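
(For reference, the KSA-to-GSA mapping mentioned in question 3 normally consists of two pieces, sketched here with the names this thread uses; treat them as illustrative:)

```sh
# 1. IAM binding: allow the KSA to impersonate the GSA.
gcloud iam service-accounts add-iam-policy-binding \
    autoneg-system@${PROJECT_ID}.iam.gserviceaccount.com \
    --role=roles/iam.workloadIdentityUser \
    --member="serviceAccount:${PROJECT_ID}.svc.id.goog[autoneg-system/autoneg-controller-manager]"

# 2. Annotation: tell GKE which GSA the KSA maps to.
kubectl annotate sa -n autoneg-system autoneg-controller-manager \
    iam.gke.io/gcp-service-account=autoneg-system@${PROJECT_ID}.iam.gserviceaccount.com
```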

natemurthy commented on May 24, 2024
  1. I deployed Autoneg with
```sh
PROJECT_ID=${PROJECT_ID} deploy/workload_identity.sh  # runs a few gcloud commands

kubectl apply -f deploy/autoneg.yaml

kubectl annotate sa -n autoneg-system autoneg-controller-manager \
  iam.gke.io/gcp-service-account=autoneg-system@${PROJECT_ID}.iam.gserviceaccount.com
```
  2. Here's what I could gather from the manager container logs:
```
nathan:hello-cluster$ kubectl -n=autoneg-system logs autoneg-controller-manager-f5ddc69b8-vtpw5 -c manager

1.6807986132289813e+09	INFO	controller-runtime.metrics	Metrics server is starting to listen	{"addr": "127.0.0.1:8080"}
1.6807986132294307e+09	INFO	setup	starting manager
1.6807986132303178e+09	INFO	Starting server	{"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
1.6807986132304182e+09	INFO	Starting server	{"kind": "health probe", "addr": "[::]:8081"}
I0406 16:30:13.230520       1 leaderelection.go:248] attempting to acquire leader lease autoneg-system/9fe89c94.controller.autoneg.dev...
I0406 16:30:13.240961       1 leaderelection.go:258] successfully acquired lease autoneg-system/9fe89c94.controller.autoneg.dev
1.6807986132411487e+09	DEBUG	events	Normal	{"object": {"kind":"Lease","namespace":"autoneg-system","name":"9fe89c94.controller.autoneg.dev","uid":"ba03cd7e-28e5-40e8-ac14-69c65f3b5341","apiVersion":"coordination.k8s.io/v1","resourceVersion":"7573723"}, "reason": "LeaderElection", "message": "autoneg-controller-manager-f5ddc69b8-vtpw5_3798322b-67e3-4fde-bb42-a672bc56ecca became leader"}
1.6807986132413092e+09	INFO	Starting EventSource	{"controller": "service", "controllerGroup": "", "controllerKind": "Service", "source": "kind source: *v1.Service"}
1.680798613241329e+09	INFO	Starting Controller	{"controller": "service", "controllerGroup": "", "controllerKind": "Service"}
1.6807986134227598e+09	INFO	Starting workers	{"controller": "service", "controllerGroup": "", "controllerKind": "Service", "worker count": 1}
1.680799113217557e+09	INFO	Applying intended status	{"controller": "service", "controllerGroup": "", "controllerKind": "Service", "service": {"name":"frontend-svc","namespace":"default"}, "namespace": "default", "name": "frontend-svc", "reconcileID": "58153b0f-7640-4a7c-995c-37a708c11a9a", "service": "default/frontend-svc", "status": {"backend_services":{"443":{"https-be":{"name":"https-be","max_connections_per_endpoint":1000}},"80":{"http-be":{"name":"http-be","max_rate_per_endpoint":100}}},"network_endpoint_groups":{"443":"k8s1-1f4ed5c4-default-frontend-svc-443-9757dbe8"},"zones":["us-central1-f"]}}
1.6807991133377185e+09	ERROR	Reconciler error	{"controller": "service", "controllerGroup": "", "controllerKind": "Service", "service": {"name":"frontend-svc","namespace":"default"}, "namespace": "default", "name": "frontend-svc", "reconcileID": "58153b0f-7640-4a7c-995c-37a708c11a9a", "error": "googleapi: Error 403: Request had insufficient authentication scopes.\nDetails:\n[\n  {\n    \"@type\": \"type.googleapis.com/google.rpc.ErrorInfo\",\n    \"domain\": \"googleapis.com\",\n    \"metadatas\": {\n      \"method\": \"compute.v1.BackendServicesService.Get\",\n      \"service\": \"compute.googleapis.com\"\n    },\n    \"reason\": \"ACCESS_TOKEN_SCOPE_INSUFFICIENT\"\n  }\n]\n\nMore details:\nReason: insufficientPermissions, Message: Insufficient Permission\n"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234
```
  3. Yes; to enable Workload Identity I ran the command below. How can I check whether the annotation on the KSA and the GSA are properly bound/mapped?

```sh
gcloud container clusters update hello-cluster \
    --zone=$DEFAULT_ZONE \
    --workload-pool=$PROJECT_ID.svc.id.goog
```
  4. I don't see https-be when I run gcloud compute backend-services list.
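
(Regarding how to check the KSA/GSA mapping: it can be inspected from both sides, e.g.:)

```sh
# Annotation on the KSA (should name the GSA):
kubectl get sa -n autoneg-system autoneg-controller-manager \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}'

# IAM policy on the GSA (should grant roles/iam.workloadIdentityUser to
# the member ...svc.id.goog[autoneg-system/autoneg-controller-manager]):
gcloud iam service-accounts get-iam-policy \
    autoneg-system@${PROJECT_ID}.iam.gserviceaccount.com
```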

rosmo commented on May 24, 2024

The backend service needs to be created beforehand and outside of Autoneg (eg. manually, via gcloud or by Terraform).
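
A minimal sketch of what that could look like (the health-check and backend-service names are illustrative; the backend service name must match the one referenced in the Service's controller.autoneg.dev/neg annotation):

```sh
# Illustrative names; adjust protocol/ports to your setup.
gcloud compute health-checks create https https-be-hc --port=443

gcloud compute backend-services create https-be \
    --global \
    --protocol=HTTPS \
    --health-checks=https-be-hc
```

Autoneg then takes care of attaching the Service's NEGs to that backend service as backends.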

natemurthy commented on May 24, 2024

I see. Do you have some template gcloud compute backend-services create commands I could try? Can you also add some notes in the Autoneg README about this and the order of operations?

It's a bit confusing because the GCP docs don't reference Autoneg anywhere, and there are countably infinite ways of configuring load balancers + backends + ingresses + NEGs + instance groups + IAM rules + ... + etc. What I usually do is:

  1. Create a deployment with kubectl
  2. Create a service with kubectl
  3. Create a managed cert with kubectl (if a new one is needed, usually preceded by a new Cloud DNS A record)
  4. Create or update an ingress with a path to the k8s service using kubectl
    • This step automatically creates a load balancer associated with any DNS + cert
    • It also creates a backend service with a network endpoint group
    • But it does not reliably or automatically attach the network endpoints to the backend service

My guess is that I would need to create a new backend service before I perform step (4) and then do another manual step of configuring the load balancer created in that step to use the backend service I created manually (unless Autoneg does this for me).

rosmo commented on May 24, 2024

If you can leverage the GKE Ingress notation, I suggest you use that. Autoneg was mainly created for two scenarios: one where a different team manages the load balancer components, and another where people wanted to use features that weren't available in the GKE Ingress controller.

natemurthy commented on May 24, 2024

For greater context, my use case is that I'm using gcloud container clusters create (GKE Standard mode) as opposed to gcloud container clusters create-auto (GKE Autopilot mode). My current issue with Standard mode is that when I create a GKE Ingress, the network endpoints (node, pod IP, and port) are not automatically created in the load balancer's (LB) network endpoint group.

So, for example, consider the frontend pod you see below:

```
nathan:hello-cluster$ k get po -o wide
NAME                             READY   STATUS    RESTARTS   AGE     IP          NODE                                             NOMINATED NODE   READINESS GATES
frontend-app-7fc967db4b-7m5tf    1/1     Running   0          15d     10.80.2.4   gke-hello-cluster-default-pool-a7743c1e-p8q4     <none>           <none>
hello-app-b5cd5796b-dn9ml        1/1     Running   0          15d     10.80.0.7   gke-hello-cluster-default-pool-a7743c1e-9bql     <none>           <none>
streamlit-app-565f54d89b-z92cw   1/1     Running   0          5d18h   10.80.3.2   gke-hello-cluster-analytics-pool-53ce8fb0-x2g7   <none>           <none>
```

Today I have to manually create the network endpoint (pod IP, port, and node) that lets clients connecting through the LB reach this pod, like so:

[screenshot: manually adding a network endpoint to the NEG in the Cloud Console]

which is obviously not best practice. Without a network endpoint properly mapping pod, port, and node for a given NEG, clients hitting my load balancer get HTTP 502 failed_to_pick_backend errors.

You mention the GKE Ingress notation and the GKE Ingress controller. Here's what my Ingress definition currently looks like:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend-app
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "frontend-static-ip"
    networking.gke.io/managed-certificates: frontend-managed-cert
    kubernetes.io/ingress.class: "gce"
spec:
  defaultBackend:
    service:
      name: frontend-svc
      port:
        number: 443
```

Do you see anything missing from my YAML that is available off-the-shelf with GKE and would address the network endpoint issue I'm experiencing?

It could be that I'm not using Autoneg for its intended purposes, since my situation is not in your list. But I haven't found a working solution to this with just GKE alone (unless there's a solution that exists which is not documented).

rosmo commented on May 24, 2024

Do you have the cloud.google.com/neg: '{"ingress": true}' annotation on your frontend-svc service? Although I think this should not be required on GKE 1.17+ as per the documentation here: https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#container-native_load_balancing

You might also be required to use a NodePort service. It also takes a while for all the necessary components to be created (5 minutes+).

natemurthy commented on May 24, 2024

Thanks! I added that annotation to the YAML spec for the service (configured as NodePort) but am still seeing the 403: Request had insufficient authentication scopes error. Here's what my frontend-svc now looks like:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc
  annotations:
    cloud.google.com/neg: '{"ingress": true, "exposed_ports": {"443":{}}}'
    controller.autoneg.dev/neg: '{"backend_services":{"443":[{"name":"frontend-https-be","max_connections_per_endpoint":1000}]}}'
spec:
  selector:
    app: frontend-app
  type: NodePort
  ports:
    - protocol: TCP
      port: 443
      targetPort: 3000
```
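
(As a side note, once the cloud.google.com/neg annotation is in place, GKE populates a neg-status annotation that can be used to confirm the NEGs were actually created:)

```sh
# The neg-status annotation is filled in by GKE once the NEGs exist:
kubectl get svc frontend-svc \
    -o jsonpath='{.metadata.annotations.cloud\.google\.com/neg-status}'

# The NEGs should also show up here:
gcloud compute network-endpoint-groups list
```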

Here are the events when I run kubectl describe svc frontend-svc:

```
  Type     Reason        Age                 From                Message
  ----     ------        ----                ----                -------
  Normal   Sync          38s                 autoneg-controller  Synced NEGs for "default/frontend-svc" as backends to backend service "frontend-https-be" (port 443)
  Warning  BackendError  13s (x13 over 38s)  autoneg-controller  googleapi: Error 403: Request had insufficient authentication scopes.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "googleapis.com",
    "metadatas": {
      "method": "compute.v1.BackendServicesService.Get",
      "service": "compute.googleapis.com"
    },
    "reason": "ACCESS_TOKEN_SCOPE_INSUFFICIENT"
  }
]

More details:
Reason: insufficientPermissions, Message: Insufficient Permission
```
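
(For what it's worth, ACCESS_TOKEN_SCOPE_INSUFFICIENT from a pod on GKE usually means the token came from the node's default service account, with its limited OAuth scopes, rather than through Workload Identity; notably, enabling Workload Identity on an existing cluster does not migrate existing node pools. Some diagnostics, with illustrative cluster/pool names:)

```sh
# Inspect the node pool's OAuth scopes and workload metadata mode:
gcloud container node-pools describe default-pool \
    --cluster=hello-cluster --zone=$DEFAULT_ZONE \
    --format="value(config.oauthScopes, config.workloadMetadataConfig)"

# Existing node pools need the GKE metadata server enabled explicitly:
gcloud container node-pools update default-pool \
    --cluster=hello-cluster --zone=$DEFAULT_ZONE \
    --workload-metadata=GKE_METADATA

# Confirm which identity a pod running as the KSA actually receives:
kubectl run -it --rm wi-test -n autoneg-system \
    --image=google/cloud-sdk:slim \
    --overrides='{"spec":{"serviceAccountName":"autoneg-controller-manager"}}' \
    -- gcloud auth list
```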

I understand that Gateway in GCP is an evolution of Ingress, although it appears to be in v1beta1. I have the most experience working with ingresses (though on other platforms such as AWS and a few on-prem k8s clusters).

natemurthy commented on May 24, 2024

Turns out I didn't need Autoneg after all. Resolved with: https://stackoverflow.com/a/76040721/1773216
