Comments (6)

weisdd commented on June 2, 2024

Seems like you're confusing two different scenarios here.

Persistent volumes

When we're talking about Persistent Volumes, they come empty and they are writable. If you deploy the example you referred to as is, a new volume is provisioned through a PVC, the volume gets mounted to /var/lib/grafana, and it all works just fine.

The snippet (with fsGroup) you shared contradicts this, because it looks like you're not using persistent volumes there.
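
For reference, a minimal sketch of the persistent-volume setup being discussed, pieced together from the manifests posted later in this thread. The claim name grafana-pvc and the volume name grafana-data are taken from those manifests and are assumptions about what the operator creates in other setups; whether this works without an fsGroup is the question the thread is trying to answer.

```yaml
apiVersion: grafana.integreatly.org/v1beta1
kind: Grafana
metadata:
  name: grafana
  labels:
    dashboards: "grafana"
spec:
  # The operator provisions a PVC from this spec.
  persistentVolumeClaim:
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
  deployment:
    spec:
      template:
        spec:
          volumes:
            # Assumption: the operator-created claim is named grafana-pvc,
            # as seen in the PVC manifest shared later in this thread.
            - name: grafana-data
              persistentVolumeClaim:
                claimName: grafana-pvc
```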

Ephemeral volumes

When you deploy a basic example (quoted below), any changes in Grafana are not persistent (they're gone once the respective pod is gone). To make sure Grafana has enough permissions to store its data, an emptyDir volume is automatically mounted to /var/lib/grafana. Again, everything works just fine.

apiVersion: grafana.integreatly.org/v1beta1
kind: Grafana
metadata:
  name: grafana
  labels:
    dashboards: "grafana"
spec:
  config:
    log:
      mode: "console"
    auth:
      disable_login_form: "false"
    security:
      admin_user: root
      admin_password: secret
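
As a rough illustration of the ephemeral case, the relevant part of the resulting pod spec might look like the fragment below. This is a hand-written sketch, not captured operator output; the volume name grafana-data is an assumption borrowed from the Deployment posted later in this thread.

```yaml
# Sketch of a pod spec fragment for the ephemeral (no PVC) case.
spec:
  containers:
    - name: grafana
      volumeMounts:
        # Grafana's data directory is backed by the emptyDir below.
        - name: grafana-data        # assumed volume name
          mountPath: /var/lib/grafana
  volumes:
    # emptyDir starts empty and is deleted together with the pod,
    # which is why changes made in Grafana do not persist.
    - name: grafana-data
      emptyDir: {}
```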

What I think is happening

uid 472 is used in the default grafana image.

But grafana-operator automatically adds runAsNonRoot: true to the pod securityContext, which changes the default uid to something else. The exact id is likely to be different in your case.
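
To make this concrete, the Deployment shared later in this thread shows the following values in the container securityContext. They are copied from that one cluster and, as noted above, the ids may well differ elsewhere.

```yaml
# Container-level securityContext as it appears in the Deployment
# posted later in this thread (values specific to that cluster).
securityContext:
  runAsNonRoot: true
  runAsUser: 10001    # differs from uid 472 used by the default image
  runAsGroup: 10001
```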

Assuming you're using persistent volumes (unlike in the snippet you shared), my guess is that you first deployed grafana with a modified securityContext (runAsNonRoot: false or something else; directly or through mutating webhooks), and grafana created some files. Then you redeployed it with another set of settings, and now grafana fails to write to the pre-existing files because they were created with different ownership and permissions.

When you specify a custom fsGroup, Kubernetes changes ownership and permissions for files upon pod start (docs).
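
A minimal sketch of how that could be set through the Grafana CR, assuming the operator passes the pod-level securityContext through unchanged. The value 10001 is the one reported in this thread; fsGroupChangePolicy is a standard Kubernetes pod field, and including it here is an illustrative assumption rather than something the thread tested.

```yaml
spec:
  deployment:
    spec:
      template:
        spec:
          securityContext:
            # Volumes that support ownership management are made
            # group-owned by this GID when they are mounted.
            fsGroup: 10001
            # Optional: skip the recursive ownership change when the
            # volume root already has the expected group.
            fsGroupChangePolicy: OnRootMismatch
```

Not every volume type honours fsGroup, so it is worth checking the behaviour of the storage driver in use.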

I don't think we need to update the example. If you redeploy grafana with a brand new volume using an up-to-date version of the operator, it should all just work.

from grafana-operator.

caguiclajmg commented on June 2, 2024

I'm also seeing what @harrythecode is reporting. This time I made sure that I had no PVs/PVCs prior to deploying the Grafana object (and yes, I see a PV getting created, so I'm certain the persistentVolumeClaim option in the manifest is taking effect), so a volume with stale files and wrong permissions is unlikely to be the cause.

Currently running v5.6.3 of the operator.

weisdd commented on June 2, 2024

@caguiclajmg @harrythecode It'd be helpful if you could share more information about your environment:

  • Kubernetes / OpenShift?
  • Version;
  • Details on underlying storage that you're using (e.g. local storage, a managed disk from a cloud provider, etc);
  • Full Grafana manifest;
  • Full Deployment manifest;
  • Full Pod manifest;
  • Full PersistentVolumeClaim manifest.

Also, if it's something that can be reproduced in a local (kind, microk8s, ...) or cloud provider environment, then full instructions would be helpful.

bavarian-ng commented on June 2, 2024

Hi,

I ran into the same issue just today when I tried to use a persistent volume claim for Grafana in a Kubernetes cluster (Amazon EKS). With the options from the example YAML, I got the same error, "missing file permissions".
(I also started from scratch, with no existing PVCs and so on.)
I played around a bit and also found this GitHub issue, which helped me with debugging.

When I add:
spec.deployment.spec.template.spec.securityContext.fsGroup: 10001
It seems to work. Maybe this helps in digging into the root cause of this issue. :)

I removed some individual stuff from the YAML I use, but with the following yaml it seems to run at the moment:

kind: Grafana
metadata:
  name: grafana
  namespace: monitoring
  labels:
    dashboards: "grafana"
spec:
  persistentVolumeClaim:
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
  config:
    log:
      mode: "console"
    [...]
  deployment:
    spec:
      template:
        spec:
          serviceAccountName: secrets-csi-sa-monitoring
          securityContext:
            fsGroup: 10001
          containers:
            - name: grafana
              securityContext:
                allowPrivilegeEscalation: true
                readOnlyRootFilesystem: false
              readinessProbe:
                failureThreshold: 3
              [...]
              image: grafana/grafana:10.4.0
               [...]
          volumes:
            - name: grafana-data
              persistentVolumeClaim:
                claimName: grafana-pvc
           [...]

weisdd commented on June 2, 2024

@bavarian-ng Thanks for reporting this. Unfortunately, sharing only the Grafana CR is not enough, because the final pod spec can be influenced by various webhooks. Please take a look at my comment above, which lists the resources that would give us a better understanding of what you're experiencing in your cluster.

bavarian-ng commented on June 2, 2024

@weisdd, sorry, here is the requested info:

  • Kubernetes / OpenShift?: Kubernetes
  • Version: v1.27
  • Details on underlying storage that you're using: AWS GP2
# kubectl describe storageclass gp2
Name:            gp2
IsDefaultClass:  Yes
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"},"name":"gp2"},"parameters":{"fsType":"ext4","type":"gp2"},"provisioner":"kubernetes.io/aws-ebs","volumeBindingMode":"WaitForFirstConsumer"}
,storageclass.kubernetes.io/is-default-class=true
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,type=gp2
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     WaitForFirstConsumer
Events:                <none>
  • Full Grafana manifest;
apiVersion: grafana.integreatly.org/v1beta1
kind: Grafana
metadata:
  name: {{ .Values.metadata.app_name }}
  namespace: {{ .Values.metadata.namespace }}
  labels:
    dashboards: "grafana"
spec:
  persistentVolumeClaim:
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi     
  config:
    log:
      mode: "console"
    auth:
      disable_login_form: "false"
    auth.google:
      enabled: "true"
      scopes: https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email
      auth_url: https://accounts.google.com/o/oauth2/auth
      token_url: https://oauth2.googleapis.com/token
      allowed_domains: <REDACTED OUR DOMAIN>
      allow_sign_up: "true"
    server:
      root_url: https://{{ .Values.ingress.servicename }}.{{ .Values.metadata.stage }}.{{ .Values.ingress.dns_zone }}
      serve_from_sub_path: "true"
    users:
      auto_assign_org_role: "Editor"
  deployment:
    spec:
      template:
        spec:
          serviceAccountName: secrets-csi-sa-monitoring
          securityContext:
            fsGroup: 10001
          containers:
            - name: grafana
              securityContext:
                allowPrivilegeEscalation: true
                readOnlyRootFilesystem: false
              readinessProbe:
                failureThreshold: 3
              env:
                - name: GF_AUTH_GOOGLE_CLIENT_ID
                  valueFrom:
                    secretKeyRef:
                      name: grafana-google-sso
                      key: client_id
                - name: GF_AUTH_GOOGLE_CLIENT_SECRET
                  valueFrom:
                    secretKeyRef:
                      name: grafana-google-sso
                      key: client_secret
                - name: GF_INSTALL_PLUGINS
                  value: grafana-oncall-app
                - name: GF_SECURITY_ADMIN_USER
                  valueFrom:
                    secretKeyRef:
                      name: grafana-admin-creds
                      key: adminuser
                - name: GF_SECURITY_ADMIN_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: grafana-admin-creds
                      key: adminpassword
              image: grafana/grafana:10.4.0
              volumeMounts:
              - name: secret-store-grafana
                mountPath: "/mnt/secrets"
              - name: secret-store-grafana-sso
                mountPath: "/mnt/sso-secrets/"
              - name: plugin-config
                mountPath: "/etc/grafana/provisioning/plugins/"
              resources:
                limits:
                  memory: {{ .Values.deployment.containers.grafana.resources.limits.memory }}
                requests:
                  cpu: {{ .Values.deployment.containers.grafana.resources.requests.cpu }}
                  memory: {{ .Values.deployment.containers.grafana.resources.requests.memory }}
          volumes:
            - name: grafana-data
              persistentVolumeClaim:
                claimName: grafana-pvc
            - name: plugin-config
              configMap:
                name: grafana-oncall-plugin-config
            - name: plugin-folder
              hostPath:
                path: /etc/grafana/provisioning/plugins
                type: DirectoryOrCreate
            - name: secret-store-grafana
              csi:
                driver: secrets-store.csi.k8s.io
                readOnly: true
                volumeAttributes:
                  secretProviderClass: "spc-grafana"
            - name: secret-store-grafana-sso
              csi:
                driver: secrets-store.csi.k8s.io
                readOnly: true
                volumeAttributes:
                  secretProviderClass: "spc-grafana-sso"

  service:
    spec:
      type: NodePort
      ports: 
        - protocol: TCP
          port: 3000
  ingress:
    metadata:
      annotations:
        kubernetes.io/ingress.class: alb
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/target-type: instance
        alb.ingress.kubernetes.io/load-balancer-name: central-loadbalancer-{{ .Values.metadata.stage }}
        alb.ingress.kubernetes.io/backend-protocol: HTTP
        alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80}, {"HTTPS":443}]'
        alb.ingress.kubernetes.io/ssl-redirect: '443'
        alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-TLS13-1-2-2021-06
        alb.ingress.kubernetes.io/certificate-arn: {{ .Values.ingress.certificate_arn }}
        alb.ingress.kubernetes.io/subnets: 'steelmntn-{{ .Values.metadata.stage  }}-vpc-public-eu-central-1a,steelmntn-{{ .Values.metadata.stage  }}-vpc-public-eu-central-1b,steelmntn-{{ .Values.metadata.stage  }}-vpc-public-eu-central-1c'
        alb.ingress.kubernetes.io/group.name: steelmountain-ingress-group
        external-dns.alpha.kubernetes.io/hostname: {{ .Values.ingress.servicename }}.{{ .Values.metadata.stage }}.{{ .Values.ingress.dns_zone }}
        external-dns.alpha.kubernetes.io/ttl: "60"
        external-dns.alpha.kubernetes.io/ingress-hostname-source: annotation-only
    spec:
      ingressClassName: alb
      rules:
        - host: {{ .Values.ingress.servicename }}.{{ .Values.metadata.stage }}.{{ .Values.ingress.dns_zone }}
          http:
            paths:
              - backend:
                  service:
                    name: {{ .Values.service.name }}
                    port:
                      number: 3000
                path: /
                pathType: Prefix
      tls:
        - hosts:
            - {{ .Values.ingress.servicename }}.{{ .Values.metadata.stage }}.{{ .Values.ingress.dns_zone }}

  • Full Deployment manifest; (resulting deployment created by operator:)
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: '1'
  creationTimestamp: '2024-03-11T14:41:22Z'
  generation: 1
  name: grafana-deployment
  namespace: monitoring
  ownerReferences:
    - apiVersion: grafana.integreatly.org/v1beta1
      kind: Grafana
      name: grafana
      uid: 67785905-eb4c-4922-a68a-fe4821aebfb0
  resourceVersion: '90847627'
  uid: 855a9b34-532c-4440-ace1-88aadf728f6a
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: grafana
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: grafana
      name: grafana-deployment
    spec:
      containers:
        - args:
            - '-config=/etc/grafana/grafana.ini'
          env:
            - name: GF_AUTH_GOOGLE_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: grafana-google-sso
            - name: GF_AUTH_GOOGLE_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: grafana-google-sso
            - name: PLUGINS_HASH
              valueFrom:
                configMapKeyRef:
                  key: PLUGINS_HASH
                  name: grafana-plugins
                  optional: true
            - name: CONFIG_HASH
              value: a5f231ba1d21e0391bebb602b1d9f4066be7b5a625ffc17c5a9396e121888feb
            - name: GF_INSTALL_PLUGINS
              value: grafana-oncall-app
            - name: TMPDIR
              value: /var/lib/grafana
            - name: GF_SECURITY_ADMIN_USER
              valueFrom:
                secretKeyRef:
                  key: adminuser
                  name: grafana-admin-creds
            - name: GF_SECURITY_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: adminpassword
                  name: grafana-admin-creds
          image: 'grafana/grafana:10.4.0'
          imagePullPolicy: IfNotPresent
          name: grafana
          ports:
            - containerPort: 3000
              name: grafana-http
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /api/health
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 3
          resources:
            limits:
              memory: 256Mi
            requests:
              cpu: 64m
              memory: 256Mi
          securityContext:
            allowPrivilegeEscalation: true
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: false
            runAsGroup: 10001
            runAsNonRoot: true
            runAsUser: 10001
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /mnt/secrets
              name: secret-store-grafana
            - mountPath: /mnt/sso-secrets/
              name: secret-store-grafana-sso
            - mountPath: /etc/grafana/provisioning/plugins/
              name: plugin-config
            - mountPath: /etc/grafana/
              name: grafana-ini
            - mountPath: /var/lib/grafana
              name: grafana-data
            - mountPath: /var/log/grafana
              name: grafana-logs
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      serviceAccount: secrets-csi-sa-monitoring
      serviceAccountName: secrets-csi-sa-monitoring
      terminationGracePeriodSeconds: 30
      volumes:
        - configMap:
            defaultMode: 420
            name: grafana-ini
          name: grafana-ini
        - emptyDir: {}
          name: grafana-logs
        - name: grafana-data
          persistentVolumeClaim:
            claimName: grafana-pvc
        - configMap:
            defaultMode: 420
            name: grafana-oncall-plugin-config
          name: plugin-config
        - hostPath:
            path: /etc/grafana/provisioning/plugins
            type: DirectoryOrCreate
          name: plugin-folder
        - csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: spc-grafana
          name: secret-store-grafana
        - csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: spc-grafana-sso
          name: secret-store-grafana-sso

  • Full Pod manifest; (resulting manifest deployed by Operator)
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: '2024-03-11T14:41:22Z'
  generateName: grafana-deployment-7bcdb6d464-
  labels:
    app: grafana
    pod-template-hash: 7bcdb6d464
  name: grafana-deployment-7bcdb6d464-pdgdt
  namespace: monitoring
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: ReplicaSet
      name: grafana-deployment-7bcdb6d464
      uid: cd7e44e8-2337-4b70-93c7-310ce84f5490
  resourceVersion: '90847622'
  uid: 4564c1bf-a682-466c-9cb9-2767ca0c7dad
spec:
  containers:
    - args:
        - '-config=/etc/grafana/grafana.ini'
      env:
        - name: GF_AUTH_GOOGLE_CLIENT_ID
          valueFrom:
            secretKeyRef:
              key: client_id
              name: grafana-google-sso
        - name: GF_AUTH_GOOGLE_CLIENT_SECRET
          valueFrom:
            secretKeyRef:
              key: client_secret
              name: grafana-google-sso
        - name: PLUGINS_HASH
          valueFrom:
            configMapKeyRef:
              key: PLUGINS_HASH
              name: grafana-plugins
              optional: true
        - name: CONFIG_HASH
          value: a5f231ba1d21e0391bebb602b1d9f4066be7b5a625ffc17c5a9396e121888feb
        - name: GF_INSTALL_PLUGINS
          value: grafana-oncall-app
        - name: TMPDIR
          value: /var/lib/grafana
        - name: GF_SECURITY_ADMIN_USER
          valueFrom:
            secretKeyRef:
              key: adminuser
              name: grafana-admin-creds
        - name: GF_SECURITY_ADMIN_PASSWORD
          valueFrom:
            secretKeyRef:
              key: adminpassword
              name: grafana-admin-creds
        - name: AWS_STS_REGIONAL_ENDPOINTS
          value: regional
        - name: AWS_DEFAULT_REGION
          value: eu-central-1
        - name: AWS_REGION
          value: eu-central-1
        - name: AWS_ROLE_ARN
          value: >-
           <REDACTED ROLE ARN>
        - name: AWS_WEB_IDENTITY_TOKEN_FILE
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      image: 'grafana/grafana:10.4.0'
      imagePullPolicy: IfNotPresent
      name: grafana
      ports:
        - containerPort: 3000
          name: grafana-http
          protocol: TCP
      readinessProbe:
        failureThreshold: 3
        httpGet:
          path: /api/health
          port: 3000
          scheme: HTTP
        initialDelaySeconds: 5
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 3
      resources:
        limits:
          memory: 256Mi
        requests:
          cpu: 64m
          memory: 256Mi
      securityContext:
        allowPrivilegeEscalation: true
        capabilities:
          drop:
            - ALL
        privileged: false
        readOnlyRootFilesystem: false
        runAsGroup: 10001
        runAsNonRoot: true
        runAsUser: 10001
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /mnt/secrets
          name: secret-store-grafana
        - mountPath: /mnt/sso-secrets/
          name: secret-store-grafana-sso
        - mountPath: /etc/grafana/provisioning/plugins/
          name: plugin-config
        - mountPath: /etc/grafana/
          name: grafana-ini
        - mountPath: /var/lib/grafana
          name: grafana-data
        - mountPath: /var/log/grafana
          name: grafana-logs
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-szvnd
          readOnly: true
        - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
          name: aws-iam-token
          readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: <REDACTED NODE NAME>
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  serviceAccount: secrets-csi-sa-monitoring
  serviceAccountName: secrets-csi-sa-monitoring
  terminationGracePeriodSeconds: 30
  tolerations:
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
  volumes:
    - name: aws-iam-token
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              audience: sts.amazonaws.com
              expirationSeconds: 86400
              path: token
    - configMap:
        defaultMode: 420
        name: grafana-ini
      name: grafana-ini
    - emptyDir: {}
      name: grafana-logs
    - name: grafana-data
      persistentVolumeClaim:
        claimName: grafana-pvc
    - configMap:
        defaultMode: 420
        name: grafana-oncall-plugin-config
      name: plugin-config
    - hostPath:
        path: /etc/grafana/provisioning/plugins
        type: DirectoryOrCreate
      name: plugin-folder
    - csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: spc-grafana
      name: secret-store-grafana
    - csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: spc-grafana-sso
      name: secret-store-grafana-sso
    - name: kube-api-access-szvnd
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
                - key: ca.crt
                  path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace
  • Full PersistentVolumeClaim manifest.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: 'yes'
    pv.kubernetes.io/bound-by-controller: 'yes'
    volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
    volume.kubernetes.io/selected-node: <NODE NAME REDACTED>
    volume.kubernetes.io/storage-provisioner: ebs.csi.aws.com
  creationTimestamp: '2024-03-11T14:41:22Z'
  finalizers:
    - kubernetes.io/pvc-protection
  name: grafana-pvc
  namespace: monitoring
  ownerReferences:
    - apiVersion: grafana.integreatly.org/v1beta1
      kind: Grafana
      name: grafana
      uid: 67785905-eb4c-4922-a68a-fe4821aebfb0
  resourceVersion: '90847469'
  uid: 0a718f39-5eb2-4a23-9d29-b630be8539f6
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: gp2
  volumeMode: Filesystem
  volumeName: pvc-0a718f39-5eb2-4a23-9d29-b630be8539f6
status:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  phase: Bound
