rht-labs / enablement-framework

Smarty 24.97% Dockerfile 51.38% Shell 23.65%

enablement-framework's Introduction

enablement-framework

This repository contains the components needed to run a TL500 enablement session.

Tooling - The required tools to deploy once a cluster is available
Helm Releases - The above tooling made available as Helm Releases

enablement-framework's Issues

Tooling: GitLab LDAP email attribute is not assigned to users on first LDAP login, preventing Web UI use

The first LDAP login for a user does not populate the email field. This does not seem to be resolved by using the attribute configuration settings (https://docs.gitlab.com/ee/administration/auth/ldap/#attribute-configuration-settings). As confirmed via ldapsearch, the LDAP user has a populated mail field before GitLab is installed and before the user's first login. Other attributes appear to be consumed successfully, as the first name, last name, and UID are set properly.

This repository currently deploys Gitlab CE 12.8.7.

Security work in GitLab 12.8.0 made this field read-only to prevent users from filling in their own email address (change below).
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/24049

This read-only change caused issues where users without a populated LDAP entry could not use the Web UI and was reverted in this change (effective Gitlab 12.10.1):
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/28541

Deploying GitLab CE 12.10.1 confirms that it is possible to change the user's email address again, but it does not resolve the initial sync issue.

"subscriptions" CRD short name causing conflicts

All of our scripts should use full CRD names. In one particular case, a facilitator installed a CRD with the short name subscriptions, the same as the subscriptions.operators.coreos.com that we use.
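One way to catch the ambiguous form before it causes trouble is a quick grep over the scripts. The sketch below is only an illustration: the function name `find_short_crd_refs` and the list of verbs it checks are assumptions, not part of this repository. It flags `oc`/`kubectl` calls that use a bare `subscription(s)` or `sub(s)` resource name and ignores the fully qualified `subscriptions.operators.coreos.com`.

```shell
# Hypothetical lint helper (not part of this repo): print the lines that call
# oc/kubectl with a bare "sub", "subs", "subscription" or "subscriptions"
# resource name. Fully qualified names such as
# subscriptions.operators.coreos.com do not match, because the bare name must
# be followed by a space, a slash or the end of the line.
find_short_crd_refs() {
  grep -nE '(oc|kubectl) +(get|describe|delete|patch|edit) +(subs?|subscriptions?)([ /]|$)' "$@"
}

# Example call (the path is a placeholder):
#   find_short_crd_refs scripts/*.sh
# Preferred form inside scripts:
#   oc get subscriptions.operators.coreos.com -n openshift-operators
```

This simple pattern only covers calls where the verb directly follows `oc` or `kubectl`; commands with flags placed before the verb would need a broader expression.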

Tooling: Fix Gitlab LDAP Lookup to not target specific index

Currently, when the gitlab chart is allowed to look up your LDAP values from your configured cluster, it expects the LDAP provider to be in the first position. This should be fixed so that the lookup is more dynamic and searches for the LDAP provider.

🐈 Make Pet Battle deployment part of this chart

As per the feedback / convo with Matt and co the other day, it would be great to have Pet Battle deploy to its own namespace on each cluster deployment. This gives the attendees time to play with it and explore while we're setting the scene with the pet-battle lightning talk.

FYI - @jfilipcz / @eformat / @mtakane / @ckavili / @oybed

stack tl500 image does not include Java 17

The latest version of pet-battle-api is configured to compile with Java 17; however, the current stack used in CRW does not include that version, only Java 11:

https://github.com/rht-labs/enablement-framework/blob/main/codereadyworkspaces/stack/Dockerfile#L29

If a developer runs a Maven command (e.g. mvn test), it fails with the following error:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project pet-battle-api: Fatal error compiling: error: invalid target release: 17 -> [Help 1]

The stack should include the same tools defined by the pet-battle application.

🐂 monitoring rbac fails for student user 🐂

Getting this error when trying the monitoring section as a student

https://rht-labs.com/tech-exercise/#/4-return-of-the-monitoring/1-enable-monitoring

Step 1.

$ oc get servicemonitor -n ${TEAM_NAME}-test -o yaml

Error from server (Forbidden): servicemonitors.monitoring.coreos.com is forbidden: User "mike" cannot list resource "servicemonitors" in API group "monitoring.coreos.com" in the namespace "ateam-test"

It seems we need this applied in the RBAC:

cat <<EOF | oc apply -f-
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tl500-monitoring-edit
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitoring-edit
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: tl500-users
EOF

🧨 install Chart is NOT idempotent 🧨

You can install and uninstall this chart.

But if you have a failed install and then try to reinstall, i.e. run this multiple times:

helm upgrade --install do500 . --namespace do500 --create-namespace --timeout=15m --set group_name=lodestar-developers

then the GitLab deployment DOES NOT redeploy properly, making a right royal mess. You have to delete the do500-gitlab namespace and let it install from fresh.

Also, the StackRox and CRW operators' finalizers are still dodgy, so you have to patch finalizers and remove objects when trying to uninstall cleanly.

Add ability to deploy autoscaler

Pin Operators To Specific Versions

Mentioned this in #32 but we can loop this in at a later point:

To avoid the workaround, is there a possibility of maybe just pinning CRW to a "healthy version"? I haven't looked too much at the problem, but wherever we end up we probably want to avoid having the operator upgrade things mid-run. So would probably be good to just pick a point to pin to once we land on something that works.
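As a rough illustration of what pinning could look like with OLM, the sketch below prints a Subscription that sets `installPlanApproval: Manual` together with a `startingCSV`. The function name `render_pinned_subscription`, its arguments, the default catalog source and every value in the usage comment are hypothetical placeholders; the real package, channel and CSV would have to be chosen once a working version is identified.

```shell
# Hypothetical helper: print an OLM Subscription pinned to one CSV.
# Args: <package-name> <namespace> <channel> <startingCSV> [catalog-source]
render_pinned_subscription() {
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: $1
  namespace: $2
spec:
  name: $1
  channel: $3
  source: ${5:-redhat-operators}
  sourceNamespace: openshift-marketplace
  installPlanApproval: Manual
  startingCSV: $4
EOF
}

# Usage (placeholder values, adjust to the operator being pinned):
#   render_pinned_subscription my-operator my-operator-ns stable my-operator.v1.2.3 | oc apply -f-
```

With manual approval, OLM waits for each InstallPlan to be approved, including the first one, so an automated install would also need a step that approves the initial plan.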

cc/ @springdo @eformat @ckavili

TL500 chart fails on StackRox deployment

TL500 Chart (v3.0.4) is failing with:

client.go:519: [debug] Add/Modify event for configure-stackrox-integration: MODIFIED
upgrade.go:369: [debug] warning: Upgrade "tl500" failed: post-upgrade hooks failed: job failed: BackoffLimitExceeded
Error: UPGRADE FAILED: post-upgrade hooks failed: job failed: BackoffLimitExceeded
helm.go:81: [debug] post-upgrade hooks failed: job failed: BackoffLimitExceeded
UPGRADE FAILED

I was not able to pull logs from the actual job/pod but I will give it another try

values related to stackrox that were used at deployment:

stackrox-chart:
  enabled: true
  stackrox:
    clusterName: tl500
    namespace: stackrox

Fix npm permission for CRW image

npm commands are not working due to the ownership of the home directory. We need to fix it.

Make tl500-base Cert-Utils operator installation optional

In some cases, chart deployment can be halted because another Cert-Utils operator instance is already present in the target environment. To prevent that, it would be great to have a way of switching off the Cert-Utils installation.

deploy cluster logging operator....

used by this rht-labs/tech-exercise#17

We can probs deploy this as ephemeral as it's just a training course. Otherwise, the logs will fill up VERY FAST!!

Create GHA workflow to handle tagging and active tag sorting.

Tooling - Add DevFile for creation of customstack

Right now this lives in a separate branch. Pull this in so that the devfile can live along with the rest of the deployment.

ADD - Sealed Secrets to the setup

For v3 of the DO500 Tech Content, Sealed secrets should be deployed once. It is a cluster wide controller so having individual learners deploy it will cause issues.

HMW have more confidence in cluster updates? ☘️

As per @eformat's #93 (comment), I'm creating this issue to discuss an ideal way to gain more confidence, in an automated way, that the tech exercises keep working across cluster and operator updates. :)

[enhancement] operators all have their own namespaces

If possible, let's see if we can deploy all operators to their own namespaces.

That way, if a breaking change happens in an operator (as it did with gitops operator 1.7.0) we could pin it to a Manual installed version very easily.

With many operators sharing the openshift-operators namespace, it becomes a lot harder to do this, as every operator in the namespace would need to be Manual if one is (linked to the OG/OLM).

This depends on whether individual operators support this capability (many do).

Update forks to point to upstream repos

We're currently pointing to some personal forks of enablement-ci-cd and todolist. This is just a reminder to ourselves to come and fix this once we fix the appropriate upstream issues.

Add ldap bind password lookup to chart

We should add the lookup of the bind password value to the gitlab helper functions as this tends to cause an issue with user lookup if not set appropriately.

Tooling - GitLab needs to internally generate HTTPS-prefixed URLs rather than HTTP URLs

GitLab currently generates HTTP-prefixed URLs internally. Although the GitLab route redirects these to HTTPS endpoints, modern browsers may warn the user that the site is insecure in the brief moment before the redirection takes place.

Changing these values in the deployments.yaml GITLAB_OMNIBUS_CONFIG should resolve this issue.

external_url "https://{{ $.Values.gitlab_app_name }}.{{ include "do500.app_domain" . }}";
nginx['listen_port']=80;
nginx['listen_https']=false;

Not able to deploy dev workspaces

There seems to be an issue with the latest available pluginregistry image.

The JDK was updated from 11.0.19 to 11.0.20 and the following error appears:

java.lang.Error: java.io.FileNotFoundException: /usr/lib/jvm/java-11-openjdk-11.0.20.0.8-2.el8.x86_64/lib/tzdb.dat (No such file or directory)

As a workaround, you can override the plugin registry image in main/tooling/charts/tl500-course-content/templates/crw/crwv2.yaml:

...
components:
  pluginRegistry:
    deployment:
      containers:
        - image: >-
            registry.redhat.io/devspaces/pluginregistry-rhel8@sha256:1be5c836fb2531475f07f48153d4b8c3db84fb7281c2cd54844b9037b0a526d5
          name: plugin-registry

Updating this solves the issue and lets the dev workspaces deploy.

#168

Hard to debug due to lookup function

This chart can be a bit unwieldy to debug due to the use of the lookup function (as noted in #10). We should update the helper functions to substitute in dummy values if it is run with helm template

Add install order

Occasionally the helm install will fail because the CheCluster is applied before the CRD is available. Need to update the chart to be smart enough to apply in the appropriate order.
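Until the chart handles ordering itself, one possible workaround is to poll for the CRD between the operator install and the step that creates the CheCluster. The sketch below is a generic retry helper under that assumption; the name `wait_for` is hypothetical, and the CRD name in the usage comment is assumed to be `checlusters.org.eclipse.che`.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the timeout (in seconds) expires. Returns 0 on success, 1 on timeout.
# Args: <timeout-seconds> <command> [args...]
wait_for() {
  wf_deadline=$(( $(date +%s) + $1 ))
  shift
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$wf_deadline" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Usage sketch: wait up to 5 minutes for the CRD before applying the
# CheCluster resource (CRD name is an assumption):
#   wait_for 300 oc get crd checlusters.org.eclipse.che
```

This only helps when the operator and the CheCluster are applied in separate steps; inside a single Helm release, the ordering would still have to be solved in the chart.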

🐈‍⬛ GitLab web hooks error

In the latest test cluster, we get "Requests to the local network are not allowed" when trying to add webhooks.

we can work around it for now using a Setting in Gitlab - Admin Menu > Settings > Network

"Allow requests to the local network from web hooks and services"

but this was not necessary for other clusters, so needs investigating?

Nexus not populated with Labs NPM Resources - No Error - Jenkins Fails

The Jenkins exercise fails with strange issues, typically because it can't find certain artifacts. For example:

npm ERR! 404 Repository not found - GET http://nexus:8081/repository/labs-npm/zone.js/-/zone.js-0.11.4.tgz
npm ERR! 404 
npm ERR! 404  'zone.js@http://nexus:8081/repository/labs-npm/zone.js/-/zone.js-0.11.4.tgz' is not in this registry.
npm ERR! 404 
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

This is related to the Nexus build and deployment from earlier. Nexus reports back fine with ArgoCD and OpenShift and Nexus appears to be working. However, when looking at Nexus, all the repositories haven't been loaded.

Jenkins build fails ...

Add IPA Passthrough SSL

📝 Description

When TL500 participants get the email with login info, many of them can't proceed due to browser security settings. While there are ways to work around it, the user experience at this point causes unnecessary concern and confusion.

Chrome NET::ERR_CERT_AUTHORITY_INVALID
Firefox Error code: SEC_ERROR_UNKNOWN_ISSUER

🚶 Steps to reproduce

Find Login info:
Click link to https://ipa.apps.***.rht-labs.com/. (replace *** as per your server info)
See screenshots below of firefox and chrome.
This will vary depending on user's specific browser config.

🧙‍♀️ Suggested solution

Can we automate set up of SSL certificates in lodestar to prevent this from happening?

🐛 - Gitlab LDAP bindDN and base not fully qualified

when deploying the base tl500 helm chart - the ldap creds for gitlab are automatically created by looking up the OAuth identity provider

this is done in the _helpers.tpl code

BUT we just had an instance where the "bind_dn" and "base" in the GITLAB_OMNIBUS_CONFIG were set incorrectly.

They should be:

'bind_dn' => 'uid=ldap-sa,cn=users,cn=accounts,dc=rht-labs,dc=com',
'base' => 'cn=accounts,dc=rht-labs,dc=com'

So, the lookup is failing somehow.

I have noticed similar behavior in another chart where you NEED to escape the ',' e.g.

helm upgrade --install my-chart \
--set ldap_bind_dn="uid=ldap_admin\\,cn=users\\,cn=accounts\\,dc=redhatlabs\\,dc=dev"
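Helm's `--set` parser treats an unescaped comma as a separator between values, which is why a DN has to be escaped as in the command above. A small helper can do the escaping so the DN can be written in its normal form. This is a minimal sketch; the function name `escape_commas` is hypothetical.

```shell
# Hypothetical helper: escape every comma with a backslash so that
# helm --set treats the value as one string instead of splitting it.
escape_commas() {
  printf '%s' "$1" | sed 's/,/\\,/g'
}

# Usage sketch (chart name and value key follow the command above;
# the chart path "." is an assumption):
#   helm upgrade --install my-chart . \
#     --set ldap_bind_dn="$(escape_commas 'uid=ldap-sa,cn=users,cn=accounts,dc=rht-labs,dc=com')"
```

Passing such values through a values file with `-f` avoids the escaping problem entirely, since YAML strings are not split on commas.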

Anyway, the root cause needs looking into. It doesn't happen in every deployment; this is the first time I have seen it for real!
