uc-cdis / gen3-helm

Helm charts for Gen3 Deployments

License: Apache License 2.0


gen3-helm's Issues

Possible misconfiguration in nginx.conf leading to null `user_id` value

Hello! Thanks for all the work put into this project, excited to use it in our deployments.

@bwalsh, @matthewpeterkort, @jawadqur

Issue Encountered

Encountering some 401/403 errors related to authentication.

Logs and Info

{"gen3log": "nginx", "date_access": "2023-03-03T00:32:29+00:00", "user_id": "-"

Possible Fix

  js_import helpers.js;
  js_set $userid userid; # <- Should be `helpers.userid` like the credentials line further down

  ...

  js_set $credentials_allowed helpers.isCredentialsAllowed;
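
For reference, the corrected directive would presumably read as follows, mirroring the namespaced call used for $credentials_allowed:

  js_import helpers.js;
  js_set $userid helpers.userid;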

After applying the above edit, the `user_id` field in the nginx logs contained the expected value and the 401/403 errors were reduced:

{"gen3log": "nginx", "date_access": "2023-03-03T00:32:29+00:00", "user_id": "[email protected]"

Reference

Duplicate PYTHONPATH keys in fence/templates/useryaml-job.yaml

There are duplicate PYTHONPATH keys in /fence/templates/useryaml-job.yaml:

        env:
          {{- toYaml .Values.env | nindent 10 }}
          - name: PYTHONPATH
            value: /var/www/fence

This is because the env array in values.yaml, rendered by the toYaml line above, already contains PYTHONPATH:

...
  - name: DB
    value: postgresql://$(PGUSER):$(PGPASSWORD)@$(PGHOST):5432/$(PGDB)
  - name: PYTHONPATH
    value: /var/www/fence
  - name: FENCE_PUBLIC_CONFIG
    valueFrom:
      configMapKeyRef:
        name: manifest-fence
...

so the hard-coded entry duplicates it, and the job renders to the following (a possible fix is sketched after this output):

apiVersion: batch/v1
kind: Job
metadata:
  name: useryaml
spec:
  backoffLimit: 10
  template:
    metadata:
      labels:
        app: gen3job
    spec:
      automountServiceAccountToken: false
      volumes:
        - name: fence-config
          secret:
            secretName: "fence-config"
        - name: useryaml
          configMap:
            name: useryaml
      containers:
      - name: fence
        image: "quay.io/cdis/fence:master"
        imagePullPolicy: Always
        env:
          - name: DD_ENABLED
            valueFrom:
              configMapKeyRef:
                key: dd_enabled
                name: manifest-global
                optional: true
          - name: DD_ENV
            valueFrom:
              fieldRef:
                fieldPath: metadata.labels['tags.datadoghq.com/env']
          - name: DD_SERVICE
            valueFrom:
              fieldRef:
                fieldPath: metadata.labels['tags.datadoghq.com/service']
          - name: DD_VERSION
            valueFrom:
              fieldRef:
                fieldPath: metadata.labels['tags.datadoghq.com/version']
          - name: DD_LOGS_INJECTION
            value: "true"
          - name: DD_PROFILING_ENABLED
            value: "true"
          - name: DD_TRACE_SAMPLE_RATE
            value: "1"
          - name: GEN3_UWSGI_TIMEOUT
            valueFrom:
              configMapKeyRef:
                key: uwsgi-timeout
                name: manifest-global
                optional: true
          - name: DD_AGENT_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP
          - name: AWS_STS_REGIONAL_ENDPOINTS
            value: regional
          - name: PYTHONPATH
            value: /var/www/fence
          - name: GEN3_DEBUG
            value: "False"
          - name: FENCE_PUBLIC_CONFIG
            valueFrom:
              configMapKeyRef:
                key: fence-config-public.yaml
                name: manifest-fence
                optional: true
          - name: PGHOST
            valueFrom:
              secretKeyRef:
                key: host
                name: fence-dbcreds
                optional: false
          - name: PGUSER
            valueFrom:
              secretKeyRef:
                key: username
                name: fence-dbcreds
                optional: false
          - name: PGPASSWORD
            valueFrom:
              secretKeyRef:
                key: password
                name: fence-dbcreds
                optional: false
          - name: PGDB
            valueFrom:
              secretKeyRef:
                key: database
                name: fence-dbcreds
                optional: false
          - name: DB
            value: postgresql://$(PGUSER):$(PGPASSWORD)@$(PGHOST):5432/$(PGDB)
          - name: PYTHONPATH
            value: /var/www/fence
        volumeMounts:
          - name: "fence-config"
            readOnly: true
            mountPath: "/var/www/fence/fence-config.yaml"
            subPath: fence-config.yaml
          - name: "useryaml"
            mountPath: "/var/www/fence/user.yaml"
            subPath: useryaml
        command: ["/bin/bash" ]
        args:
          - "-c"
          # Script always succeeds if it runs (echo exits with 0)
          - |
            fence-create sync --arborist http://arborist-service --yaml /var/www/fence/user.yaml
      restartPolicy: OnFailure
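
One possible fix, sketched here under the assumption that PYTHONPATH should be supplied only through .Values.env, is to drop the hard-coded entry from the template:

        env:
          {{- toYaml .Values.env | nindent 10 }}
          # PYTHONPATH is intentionally not repeated here; it is already provided by .Values.env in values.yaml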

nginx unable to retrieve schema.json

Hi All! Thanks for all the effort put into this helm project! Using your instructions we were able to get a Gen3 local deployment via Helm up and running with custom TLS certs.

@bwalsh, @matthewpeterkort, @jawadqur

Issue Encountered

We are running into a small bug regarding schema.json, specifically with retrieving it at runtime from the data portal.

Logs and Info

Judging by the revproxy logs, nginx is unable to find schema.json because it is not in the expected directory within the portal deployment:

2023/03/02 22:08:42 [error] 18#18: *29 open() "/usr/share/nginx/html/data/schema.json" failed (2: No such file or directory), client: 10.42.0.125, server: , request:
"GET //data/schema.json HTTP/1.0", host: "aced-training.compbio.ohsu.edu", referrer: "https://aced-training.compbio.ohsu.edu/_root"
10.42.0.125 - - [02/Mar/2023:22:08:42 +0000] "GET //data/schema.json HTTP/1.0" 404 153 "https://aced-training.compbio.ohsu.edu/_root" "Mozilla/5.0 (Macintosh; Intel M
ac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/110.0" "10.42.0.34"

Successful fetching of the data schema upon building the data portal:

INFO: Running schema and relay for non-workspace bundle

> [email protected] schema
> node ./data/getSchema

Fetching http://revproxy-service/api/v0/submission/getschema
Fetching http://revproxy-service/api/v0/submission/_dictionary/_all
All done!

schema.json is successfully written to the /data-portal/data directory:

root@portal-deployment-7dcd4c8446-8gk7d:/data-portal/data# ls
config           dictionaryHelper.js       getSchema.js  gqlHelper.js.njk  parameters.js   schema.json
dictionary.json  dictionaryHelper.test.js  getTexts.js   gqlSetup.js       schema.graphql  utils.js

root@portal-deployment-7dcd4c8446-8gk7d:/data-portal/data# ls -alt schema.json
-rw-r--r-- 1 root root 5624461 Mar  2 22:11 schema.json

schema.json is not written or copied to the nginx directory where it can then be served by the data portal:

ls -alt /usr/share/nginx/html/data/schema.json
ls: cannot access '/usr/share/nginx/html/data/schema.json': No such file or directory

Current Fix

Manually copying the file from /data/schema.json to the /usr/share/nginx/html/data/ directory solves the issue.
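
A rough sketch of that manual workaround, assuming the portal pod belongs to a Deployment named portal-deployment and that the target directory may first need to be created:

kubectl exec deploy/portal-deployment -- mkdir -p /usr/share/nginx/html/data
kubectl exec deploy/portal-deployment -- cp /data-portal/data/schema.json /usr/share/nginx/html/data/schema.json

Note that this only patches the running pod, so it would need to be repeated whenever the pod restarts.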

Duplicate env section in fence presigned-url-fence.yaml

There is a duplicate env section in presigned-url-fence.yaml

          env:
            {{- toYaml .Values.env | nindent 12 }}
          volumeMounts:
            {{- toYaml .Values.volumeMounts | nindent 12 }}
          env:
            {{ toYaml .Values.initEnv | nindent 12 }}

which renders to the following (note the extra env: section after the volume mounts and the duplicated secret references); a possible fix is sketched after the rendered output:

containers:
        - name: presigned-url-fence
          image: "quay.io/cdis/fence:master"
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
            - name: container
              containerPort: 6567
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /_status
              port: http
            initialDelaySeconds: 30
            periodSeconds: 60
            timeoutSeconds: 30
          readinessProbe:
            httpGet:
              path: /_status
              port: http
          resources:
            limits:
              cpu: 1
              memory: 2400Mi
            requests:
              cpu: 100m
              memory: 128Mi
          command: ["/bin/bash"]
          args:
            - "-c"
            - |
              echo "${FENCE_PUBLIC_CONFIG:-""}" > "/var/www/fence/fence-config-public.yaml"
              python /var/www/fence/yaml_merge.py /var/www/fence/fence-config-public.yaml /var/www/fence/fence-config-secret.yaml > /var/www/fence/fence-config.yaml
              if [[ -f /fence/keys/key/jwt_private_key.pem ]]; then
                openssl rsa -in /fence/keys/key/jwt_private_key.pem -pubout > /fence/keys/key/jwt_public_key.pem
              fi
              bash /fence/dockerrun.bash && if [[ -f /dockerrun.sh ]]; then bash /dockerrun.sh; fi
          env:
            - name: DD_ENABLED
              valueFrom:
                configMapKeyRef:
                  key: dd_enabled
                  name: manifest-global
                  optional: true
            - name: DD_ENV
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['tags.datadoghq.com/env']
            - name: DD_SERVICE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['tags.datadoghq.com/service']
            - name: DD_VERSION
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['tags.datadoghq.com/version']
            - name: DD_LOGS_INJECTION
              value: "true"
            - name: DD_PROFILING_ENABLED
              value: "true"
            - name: DD_TRACE_SAMPLE_RATE
              value: "1"
            - name: GEN3_UWSGI_TIMEOUT
              valueFrom:
                configMapKeyRef:
                  key: uwsgi-timeout
                  name: manifest-global
                  optional: true
            - name: DD_AGENT_HOST
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: AWS_STS_REGIONAL_ENDPOINTS
              value: regional
            - name: PYTHONPATH
              value: /var/www/fence
            - name: GEN3_DEBUG
              value: "False"
            - name: FENCE_PUBLIC_CONFIG
              valueFrom:
                configMapKeyRef:
                  key: fence-config-public.yaml
                  name: manifest-fence
                  optional: true
            - name: PGHOST
              valueFrom:
                secretKeyRef:
                  key: host
                  name: fence-dbcreds
                  optional: false
            - name: PGUSER
              valueFrom:
                secretKeyRef:
                  key: username
                  name: fence-dbcreds
                  optional: false
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  key: password
                  name: fence-dbcreds
                  optional: false
            - name: PGDB
              valueFrom:
                secretKeyRef:
                  key: database
                  name: fence-dbcreds
                  optional: false
            - name: DB
              value: postgresql://$(PGUSER):$(PGPASSWORD)@$(PGHOST):5432/$(PGDB)
          volumeMounts:
            - mountPath: /var/www/fence/local_settings.py
              name: old-config-volume
              readOnly: true
              subPath: local_settings.py
            - mountPath: /var/www/fence/fence_credentials.json
              name: json-secret-volume
              readOnly: true
              subPath: fence_credentials.json
            - mountPath: /var/www/fence/creds.json
              name: creds-volume
              readOnly: true
              subPath: creds.json
            - mountPath: /var/www/fence/config_helper.py
              name: config-helper
              readOnly: true
              subPath: config_helper.py
            - mountPath: /fence/fence/static/img/logo.svg
              name: logo-volume
              readOnly: true
              subPath: logo.svg
            - mountPath: /fence/fence/static/privacy_policy.md
              name: privacy-policy
              readOnly: true
              subPath: privacy_policy.md
            - mountPath: /var/www/fence/fence-config.yaml
              name: config-volume
              readOnly: true
              subPath: fence-config.yaml
            - mountPath: /var/www/fence/yaml_merge.py
              name: yaml-merge
              readOnly: true
              subPath: yaml_merge.py
            - mountPath: /var/www/fence/fence_google_app_creds_secret.json
              name: fence-google-app-creds-secret-volume
              readOnly: true
              subPath: fence_google_app_creds_secret.json
            - mountPath: /var/www/fence/fence_google_storage_creds_secret.json
              name: fence-google-storage-creds-secret-volume
              readOnly: true
              subPath: fence_google_storage_creds_secret.json
            - mountPath: /fence/keys/key/jwt_private_key.pem
              name: fence-jwt-keys
              readOnly: true
              subPath: jwt_private_key.pem
          env:
            
            - name: PGHOST
              valueFrom:
                secretKeyRef:
                  key: host
                  name: fence-dbcreds
                  optional: false
            - name: PGUSER
              valueFrom:
                secretKeyRef:
                  key: username
                  name: fence-dbcreds
                  optional: false
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  key: password
                  name: fence-dbcreds
                  optional: false
            - name: PGDB
              valueFrom:
                secretKeyRef:
                  key: database
                  name: fence-dbcreds
                  optional: false
            - name: DB
              value: postgresql://$(PGUSER):$(PGPASSWORD)@$(PGHOST):5432/$(PGDB)
            - name: PYTHONPATH
              value: /var/www/fence
            - name: FENCE_PUBLIC_CONFIG
              valueFrom:
                configMapKeyRef:
                  key: fence-config-public.yaml
                  name: manifest-fence
                  optional: true
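
One possible fix, sketched here under the assumption that the .Values.initEnv entries only duplicate values already present in .Values.env for this deployment, is to keep a single env: key (or, if initEnv is really intended for the init container, move it there instead):

          env:
            {{- toYaml .Values.env | nindent 12 }}
          volumeMounts:
            {{- toYaml .Values.volumeMounts | nindent 12 }}
          # the second env: block is dropped; its .Values.initEnv entries already appear in .Values.env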

Followed basic instructions from the README but still not able to deploy

I am new to Gen3 and followed the instructions to deploy the most basic setup using Helm. I installed Rancher Desktop and postgreSQL@13, but I still cannot reach the website at https://localhost/. Here is my values.yaml:

global:
  hostname: localhost
  # hostname: example-commons.com

# fence: 
  FENCE_CONFIG:
    # Any fence-config overrides here. ]


arborist:
  postgres:
    dbCreate: true
    username: gen3_arborist
    password: gen3_arborist

When I run helm upgrade --install gen3 gen3/gen3 -f ./values.yaml and check the status of the pods, it seems like some pods are not working or are stuck pending:

kubectl get pod
NAME                                              READY   STATUS             RESTARTS          AGE
wts-deployment-57ff756898-hc6tz                   0/1     Pending            0                 2d17h
fence-deployment-6f8489fb88-v4xlp                 0/1     Pending            0                 2d17h
portal-deployment-6c7f86d4f8-dkvbn                0/1     Pending            0                 2d17h
hatchery-deployment-7fcb68fb65-phpf7              1/1     Running            0                 2d17h
wts-oidc-job-52gzp                                0/2     Init:0/1           0                 2d17h
revproxy-deployment-9764957cd-cl5n5               1/1     Running            0                 2d17h
gen3-postgresql-0                                 1/1     Running            0                 2d17h
sower-85597bddbf-j5s86                            1/1     Running            0                 2d17h
manifestservice-deployment-6c74479448-tnrbs       1/1     Running            0                 2d17h
indexd-dbcreate-d86wl                             0/1     Completed          0                 2d17h
metadata-dbcreate-qvb88                           0/1     Completed          0                 2d17h
audit-dbcreate-k6gkf                              0/1     Completed          0                 2d17h
peregrine-dbcreate-rj5xf                          0/1     Completed          0                 2d17h
wts-dbcreate-twg8k                                0/1     Completed          0                 2d17h
arborist-dbcreate-ghqzv                           0/1     Completed          0                 2d17h
sheepdog-dbcreate-m6v8g                           0/1     Completed          0                 2d17h
fence-dbcreate-95kgz                              0/1     Completed          0                 2d17h
gen3-elasticsearch-master-0                       1/1     Running            0                 2d17h
arborist-deployment-77645d555-ch4tv               1/1     Running            0                 2d17h
indexd-deployment-b845d4565-flk9p                 1/1     Running            0                 2d17h
metadata-deployment-559bbdd459-86ptb              1/1     Running            0                 2d17h
audit-deployment-54c96847c8-696x8                 1/1     Running            0                 2d17h
indexd-userdb-4lnc7                               0/1     Completed          0                 2d17h
presigned-url-fence-deployment-6d657f9cfd-kbrsf   1/1     Running            0                 2d17h
ambassador-deployment-6cd65d48d6-dljsz            0/1     Running            52 (26h ago)      2d17h
sheepdog-deployment-746959d756-l27vz              0/1     CrashLoopBackOff   337 (3m32s ago)   2d17h
argo-wrapper-deployment-85f5d4b756-mv2l5          0/1     ImagePullBackOff   0                 2d17h
peregrine-deployment-6d9b6b584b-9dgnn             0/1     CrashLoopBackOff   222 (22s ago)     2d17h
pidgin-deployment-b9c7c5b7d-rdgzs                 0/1     CrashLoopBackOff   219 (21s ago)     2d17h

Here are some logs from the crashed pods:

(base) FDSIT-7000-M7:config minghao.zhou$ kubectl logs pidgin-deployment-b9c7c5b7d-rdgzs
Got configuration:
GEN3_DEBUG=False
GEN3_UWSGI_TIMEOUT=45s
GEN3_DRYRUN=False
Running update-ca-certificates
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Running mkdir -p /var/run/gen3
Running nginx -g daemon off;
Running uwsgi --ini /etc/uwsgi/uwsgi.ini
[uWSGI] getting INI configuration from /app/uwsgi.ini
[uWSGI] getting INI configuration from /etc/uwsgi/uwsgi.ini
open("./python3_plugin.so"): No such file or directory [core/utils.c line 3732]
!!! UNABLE to load uWSGI plugin: ./python3_plugin.so: cannot open shared object file: No such file or directory !!!
*** Starting uWSGI 2.0.20 (64bit) on [Mon Apr  1 16:29:58 2024] ***
compiled with version: 8.3.0 on 14 March 2022 16:22:45
os: Linux-6.6.14-0-virt #1-Alpine SMP Fri, 26 Jan 2024 11:08:07 +0000
nodename: pidgin-deployment-b9c7c5b7d-rdgzs
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /var/www/pidgin
detected binary path: /usr/local/bin/uwsgi
chdir() to /pidgin/
your memory page size is 4096 bytes
detected max file descriptor number: 1048576
lock engine: pthread robust mutexes
!!! it looks like your kernel does not support pthread robust mutexes !!!
!!! falling back to standard pthread mutexes !!!
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to UNIX address /var/run/gen3/uwsgi.sock fd 3
setgid() to 102
set additional group 101 (ssh)
setuid() to 102
Python version: 3.9.10 (main, Mar  2 2022, 04:40:14)  [GCC 8.3.0]
2024/04/01 16:29:58 [notice] 2011#2011: using the "epoll" event method
2024/04/01 16:29:58 [notice] 2011#2011: nginx/1.21.1
2024/04/01 16:29:58 [notice] 2011#2011: built by gcc 8.3.0 (Debian 8.3.0-6) 
2024/04/01 16:29:58 [notice] 2011#2011: OS: Linux 6.6.14-0-virt
2024/04/01 16:29:58 [notice] 2011#2011: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/04/01 16:29:58 [notice] 2011#2011: start worker processes
2024/04/01 16:29:58 [notice] 2011#2011: start worker process 2017
2024/04/01 16:29:58 [emerg] 2017#2017: io_setup() failed (38: Function not implemented)
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x4000171d60
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 304776 bytes (297 KB) for 2 cores
*** Operational MODE: preforking ***
added /usr/local/lib/python3.9/site-packages/ to pythonpath.
WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0x4000171d60 pid: 2013 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 2013)
spawned uWSGI worker 1 (pid: 2019, cores: 1)
spawned uWSGI worker 2 (pid: 2021, cores: 1)
[2024-04-01 16:30:08,527][pidgin.app][  ERROR] Peregrine not available; returning unhealthy
[2024-04-01 16:30:18,526][pidgin.app][  ERROR] Peregrine not available; returning unhealthy
(base) FDSIT-7000-M7:config minghao.zhou$ kubectl logs peregrine-deployment-6d9b6b584b-9dgnn
Got configuration:
GEN3_DEBUG=False
GEN3_UWSGI_TIMEOUT=600
GEN3_DRYRUN=False
Running update-ca-certificates
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Running mkdir -p /var/run/gen3
Running nginx -g daemon off;
Running uwsgi --ini /etc/uwsgi/uwsgi.ini
[uWSGI] getting INI configuration from /app/uwsgi.ini
[uWSGI] getting INI configuration from /etc/uwsgi/uwsgi.ini
open("./python3_plugin.so"): No such file or directory [core/utils.c line 3732]
!!! UNABLE to load uWSGI plugin: ./python3_plugin.so: cannot open shared object file: No such file or directory !!!
*** Starting uWSGI 2.0.20 (64bit) on [Mon Apr  1 16:29:47 2024] ***
compiled with version: 8.3.0 on 11 November 2021 18:25:57
os: Linux-6.6.14-0-virt #1-Alpine SMP Fri, 26 Jan 2024 11:08:07 +0000
nodename: peregrine-deployment-6d9b6b584b-9dgnn
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /var/www/peregrine
detected binary path: /usr/local/bin/uwsgi
your memory page size is 4096 bytes
detected max file descriptor number: 1048576
lock engine: pthread robust mutexes
!!! it looks like your kernel does not support pthread robust mutexes !!!
!!! falling back to standard pthread mutexes !!!
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to UNIX address /var/run/gen3/uwsgi.sock fd 3
setgid() to 102
set additional group 101 (ssh)
setuid() to 102
Python version: 3.6.15 (default, Oct 13 2021, 09:49:57)  [GCC 8.3.0]
2024/04/01 16:29:47 [notice] 2011#2011: using the "epoll" event method
2024/04/01 16:29:47 [notice] 2011#2011: nginx/1.21.1
2024/04/01 16:29:47 [notice] 2011#2011: built by gcc 8.3.0 (Debian 8.3.0-6) 
2024/04/01 16:29:47 [notice] 2011#2011: OS: Linux 6.6.14-0-virt
2024/04/01 16:29:47 [notice] 2011#2011: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/04/01 16:29:47 [notice] 2011#2011: start worker processes
2024/04/01 16:29:47 [notice] 2011#2011: start worker process 2017
2024/04/01 16:29:47 [emerg] 2017#2017: io_setup() failed (38: Function not implemented)
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x4000172420
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 45 seconds
mapped 304776 bytes (297 KB) for 2 cores
*** Operational MODE: preforking ***
added /var/www/peregrine/ to pythonpath.
added /peregrine/ to pythonpath.
added /usr/local/lib/python3.6/site-packages/ to pythonpath.
failed to open python file /var/www/peregrine/wsgi.py
unable to load app 0 (mountpoint='') (callable not found or import error)
*** no app loaded. going in full dynamic mode ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 2013)
spawned uWSGI worker 1 (pid: 2019, cores: 1)
spawned uWSGI worker 2 (pid: 2021, cores: 1)
--- no python application found, check your startup logs for errors ---
--- no python application found, check your startup logs for errors ---
--- no python application found, check your startup logs for errors ---
(base) FDSIT-7000-M7:config minghao.zhou$ kubectl logs argo-wrapper-deployment-85f5d4b756-mv2l5
Error from server (BadRequest): container "argo-wrapper" in pod "argo-wrapper-deployment-85f5d4b756-mv2l5" is waiting to start: trying and failing to pull image
(base) FDSIT-7000-M7:config minghao.zhou$ kubectl logs sheepdog-deployment-746959d756-l27vz 
Defaulted container "sheepdog" out of: sheepdog, sheepdog-init (init)

Any suggestions on this?


When I reinstall gen3, the logs from "default/portal-deployment-6c7f86d4f8-brqjn" look like this:

> [email protected] schema
> node ./data/getSchema

Fetching http://revproxy-service/api/v0/submission/getschema
Fetching http://revproxy-service/api/v0/submission/_dictionary/_all
failed fetch - non-200 from server: 502, sleeping 2214 then retry http://revproxy-service/api/v0/submission/getschema
failed fetch - non-200 from server: 502, sleeping 2922 then retry http://revproxy-service/api/v0/submission/_dictionary/_all
Retrying http://revproxy-service/api/v0/submission/getschema after sleep - 1
Re-fetching http://revproxy-service/api/v0/submission/getschema - retry no 1
failed fetch - non-200 from server: 502, sleeping 4652 then retry http://revproxy-service/api/v0/submission/getschema
Retrying http://revproxy-service/api/v0/submission/_dictionary/_all after sleep - 1
Re-fetching http://revproxy-service/api/v0/submission/_dictionary/_all - retry no 1
failed fetch - non-200 from server: 502, sleeping 4409 then retry http://revproxy-service/api/v0/submission/_dictionary/_all
Retrying http://revproxy-service/api/v0/submission/getschema after sleep - 2
Re-fetching http://revproxy-service/api/v0/submission/getschema - retry no 2
failed fetch - non-200 from server: 502, sleeping 9668 then retry http://revproxy-service/api/v0/submission/getschema
Retrying http://revproxy-service/api/v0/submission/_dictionary/_all after sleep - 2
Re-fetching http://revproxy-service/api/v0/submission/_dictionary/_all - retry no 2
failed fetch - non-200 from server: 502, sleeping 8286 then retry http://revproxy-service/api/v0/submission/_dictionary/_all
Retrying http://revproxy-service/api/v0/submission/_dictionary/_all after sleep - 3
Re-fetching http://revproxy-service/api/v0/submission/_dictionary/_all - retry no 3
failed fetch - non-200 from server: 502, sleeping 16252 then retry http://revproxy-service/api/v0/submission/_dictionary/_all
Retrying http://revproxy-service/api/v0/submission/getschema after sleep - 3
Re-fetching http://revproxy-service/api/v0/submission/getschema - retry no 3
failed fetch - non-200 from server: 502, sleeping 17910 then retry http://revproxy-service/api/v0/submission/getschema
Retrying http://revproxy-service/api/v0/submission/_dictionary/_all after sleep - 4
Re-fetching http://revproxy-service/api/v0/submission/_dictionary/_all - retry no 4
failed fetch - non-200 from server: 502, sleeping 16509 then retry http://revproxy-service/api/v0/submission/_dictionary/_all
Retrying http://revproxy-service/api/v0/submission/getschema after sleep - 4
Re-fetching http://revproxy-service/api/v0/submission/getschema - retry no 4
failed fetch - non-200 from server: 502, sleeping 17043 then retry http://revproxy-service/api/v0/submission/getschema
Retrying http://revproxy-service/api/v0/submission/_dictionary/_all after sleep - 5
Re-fetching http://revproxy-service/api/v0/submission/_dictionary/_all - retry no 5
Error:  failed fetch non-200 from server: 502, max retries 4 exceeded for http://revproxy-service/api/v0/submission/_dictionary/_all
Stream closed EOF for default/portal-deployment-6c7f86d4f8-brqjn (portal)

Duplicate keys in sheepdog deployment yaml

There is a duplicate env var key in helm/sheepdog/templates/deployment.yaml, which causes a duplicate-name warning when the rendered manifest is applied (a possible fix is sketched after the snippet).

error:

W0104 17:44:04.431345    8988 warnings.go:70] spec.template.spec.containers[0].env[15].name: duplicate name "FENCE_URL"

snippet:

            - name: FENCE_URL
              valueFrom:
                configMapKeyRef:
                  name: manifest-global
                  key: fence_url
                  optional: true
            - name: INDEXD_PASS
              valueFrom:
                secretKeyRef:
                  name: indexd-service-creds
                  key: gdcapi
                  optional: false
            - name: GEN3_UWSGI_TIMEOUT
              value: "600"
            - name: DICTIONARY_URL
              value: {{ include "sheepdog.dictionaryUrl" .}}
            {{- with .Values.indexdUrl }}
            - name: INDEX_CLIENT_HOST
              value: {{ . }}
            {{- end }}
            {{- with .Values.fenceUrl }}
            - name: FENCE_URL
              value: {{ . }}
            {{- end }}
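
One possible fix, sketched here without knowing which value the maintainers want to take precedence, is to guard the first FENCE_URL entry so that it only renders when .Values.fenceUrl is not set:

            {{- if not .Values.fenceUrl }}
            - name: FENCE_URL
              valueFrom:
                configMapKeyRef:
                  name: manifest-global
                  key: fence_url
                  optional: true
            {{- end }}
            {{- with .Values.fenceUrl }}
            - name: FENCE_URL
              value: {{ . }}
            {{- end }}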

[Error] Unable to deploy after several revisions - Kubernetes secret can hold 1048576 bytes at most

Description

  • After several revisions have been deployed, I eventually hit a wall that prevents further deployments until I tear down the chart and reinstall it. I can potentially work around this by providing a --history-max value to the upgrade command (see the command sketch at the end of this issue).

Potential Causes/Targets

  • I think this could be related to how large configurations are supplied to the deployments (i.e. fence.USER_YAML and portal.gitops.json). These large configurations are currently supplied directly through the values interface, whereas they could instead be handled as separate files mounted into the deployments' containers.

  • I have also read that this error is often resolved by making sure the chart's .helmignore file(s) are configured appropriately.

Error Msg

Error: UPGRADE FAILED: create: failed to create: Secret "sh.helm.release.v1.gen3-development.v35" is invalid: data: Too long: must have at most 1048576 bytes
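
For reference, a sketch of the --history-max workaround mentioned above, reusing the release and chart names from earlier in this thread; the flag caps how many release secrets Helm keeps per release:

helm upgrade --install gen3 gen3/gen3 -f ./values.yaml --history-max 3

This limits history growth but does not shrink each individual release secret; moving the large fence.USER_YAML and portal.gitops.json payloads out of the values would be needed for that.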

Configure securityContext for containers

In order to improve the security of the containers, as described in the references below, we request that consideration be given to configuring the securityContext so that containers run as a user other than root. This would help compartmentalize the impact of a compromised container by mitigating potential container escape vulnerabilities.
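
A minimal sketch of what that could look like in a deployment template; the user and group IDs here are placeholders rather than values taken from the charts, and the container-level settings would need testing against each image:

    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
      containers:
        - name: fence
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]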
