This repository contains tools, scripts, docs, and anything else that might be helpful as part of a UA handover.
See CONTRIBUTING.md for info on how to contribute.
Vault channel is specified in the bundle, but the check fails
ua-bundle-checks.openstack (1).log
vault:
  bindings:
    ? ''
    : oam-space
    certificates: internal-space
    etcd: internal-space
    secrets: internal-space
    shared-db: internal-space
  channel: 1.8/stable
  charm: vault
  num_units: 3
  series: jammy
  to:
  - '10'
  - '11'
  - '9'
Will send the full bundle privately
If the path provided to --bundle is not a valid path and the file is not found, all checks are silently skipped but no error is raised.
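A minimal sketch of the kind of guard that would fix this, assuming the checker reads the bundle through a single entry point (the function name `load_bundle` is illustrative, not the checker's actual internals):

```python
import os
import sys

def load_bundle(path):
    # Fail fast instead of silently skipping every check when the
    # path passed to --bundle does not exist.
    if not os.path.isfile(path):
        sys.stderr.write(f"ERROR: bundle file not found: {path}\n")
        sys.exit(1)
    with open(path) as f:
        return f.read()
```

Exiting non-zero also lets CI pipelines that wrap the checker notice the mistake.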
Currently we have the following "VM test extras" (if applicable)
I think we should add further network tests.
I know this will give the FEs more things to test, but it will ensure we catch any issues early.
With Kubernetes especially, mysql-innodb-cluster is used only as the Vault storage backend, so the default innodb-buffer-pool-size is actually sufficient.
../ua-reviewkit/juju/ua-bundle-check.py --bundle ./generated/kubernetes/bundle.yaml
=> application 'mysql-innodb-cluster'
[PASS] HA (>=3)
[PASS] max-connections (value=2000)
[FAIL] innodb-buffer-pool-size (value=268435456, expected=6442450944)
https://github.com/canonical/ua-reviewkit/blob/main/juju/checks/openstack.yaml#L35
In order to have a different assertion for Kubernetes deployments, we should add one to checks/kubernetes.yaml.
I understand that development of ua-reviewkit is now done on GitHub instead of Launchpad, but git.launchpad.net still has content and it is hard to notice that it is deprecated.
Can we add one commit to that git repository deleting everything except a single README saying we should refer to https://github.com/canonical/ua-reviewkit from now on? That way we can avoid confusion.
https://git.launchpad.net/ua-reviewkit/tree/
P.S. I found this because one handover was still using git.launchpad.net as the source of some tests.
Charmstore versions of the charm had the "channel" config set to "stable", which results in whatever snap version is the default at the time of deployment being installed. Newer versions of the charm pin to 1.7/stable, 1.8/stable, etc.
We should throw an error when we see vault charms deployed with channel=stable.
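A minimal sketch of such a check, operating on the applications mapping of a parsed bundle (the function name and exact field access are illustrative assumptions, not ua-bundle-check's actual schema):

```python
# Flag vault applications whose snap "channel" config option is the
# bare "stable" channel, which installs whatever the default snap
# version happens to be at deploy time.
def find_unpinned_vault(applications):
    offenders = []
    for name, app in applications.items():
        if "vault" not in app.get("charm", ""):
            continue
        channel = app.get("options", {}).get("channel")
        if channel == "stable":
            offenders.append(name)
    return offenders
```

Anything returned here would be reported as an error; pinned tracks such as 1.8/stable pass untouched.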
I tried to run the kubernetes-extra-checks.py script, and it got stuck trying to pull the sonobuoy image from projects.registry.vmware.com/sonobuoy/sonobuoy.
The pod description reported the following errors:
Normal Scheduled 44s default-scheduler Successfully assigned sonobuoy/sonobuoy to juju-29bbea-kubernetes-10
Normal BackOff 14s (x2 over 42s) kubelet Back-off pulling image "projects.registry.vmware.com/sonobuoy/sonobuoy:v0.56.3"
Warning Failed 14s (x2 over 42s) kubelet Error: ImagePullBackOff
Normal Pulling 3s (x3 over 43s) kubelet Pulling image "projects.registry.vmware.com/sonobuoy/sonobuoy:v0.56.3"
Warning Failed 3s (x3 over 43s) kubelet Failed to pull image "projects.registry.vmware.com/sonobuoy/sonobuoy:v0.56.3": rpc error: code = NotFound desc = failed to pull and unpack image "projects.registry.vmware.com/sonobuoy/sonobuoy:v0.56.3": failed to resolve reference "projects.registry.vmware.com/sonobuoy/sonobuoy:v0.56.3": projects.registry.vmware.com/sonobuoy/sonobuoy:v0.56.3: not found
Warning Failed 3s (x3 over 43s) kubelet Error: ErrImagePull
I was able to run it successfully, though, by not specifying an image for sonobuoy in the line
./sonobuoy run --kube-conformance-image=${SONOBUOY_CONFORMANCE_IMAGE} --mode=${SONOBUOY_MODE} --skip-preflight --plugin e2e --e2e-parallel ${SONOBUOY_PARALLEL} --wait 2>&1
as it then pulls sonobuoy/sonobuoy from Docker Hub by default.
We currently use fio, but this isn't run for smaller deployments and/or ones with no RBD workload.
It'd be useful to still have some baseline data; we can probably run "rados bench" at least.
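A rough sketch of what a minimal "rados bench" baseline could look like, wrapped for scripting (the pool name, duration, and function names are illustrative assumptions; --no-cleanup keeps the written objects so the read phases have data):

```python
import subprocess

def rados_bench_cmds(pool="ua-bench", seconds=30):
    # Write phase first (kept with --no-cleanup), then sequential and
    # random read phases against those objects, then cleanup.
    return [
        ["rados", "bench", "-p", pool, str(seconds), "write", "--no-cleanup"],
        ["rados", "bench", "-p", pool, str(seconds), "seq"],
        ["rados", "bench", "-p", pool, str(seconds), "rand"],
        ["rados", "-p", pool, "cleanup"],
    ]

def run_rados_bench(pool="ua-bench", seconds=30):
    for cmd in rados_bench_cmds(pool, seconds):
        subprocess.run(cmd, check=True)
```

This gives throughput/latency numbers even on clusters with no RBD workload where fio is currently skipped.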
Enhancement to add checks for ldap connection timeouts.
charm: keystone-ldap
config param: ldap-config-flags
The value of ldap-config-flags is a JSON string, so a regex needs to be applied for this check.
(This will require a new assertion schema grep/contains in ua-bundlechecks)
Conditions to check:
If use_pool is true, then pool_connection_timeout must be set.
If use_pool is false or not set, then connection_timeout must be set.
Note: the params use_pool, pool_connection_timeout, connection_timeout are part of ldap-config-flags value.
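A rough sketch of the conditional logic; the issue asks for a grep/regex-style assertion schema, but for illustration this simply parses the flags string as JSON (function name and parsing approach are assumptions):

```python
import json

def ldap_timeouts_ok(ldap_config_flags):
    # The charm stores these flags as a JSON string; apply the two
    # rules described above once parsed.
    flags = json.loads(ldap_config_flags)
    if str(flags.get("use_pool", "")).lower() == "true":
        return "pool_connection_timeout" in flags
    return "connection_timeout" in flags
```

The str()/lower() dance covers both a JSON boolean true and the string "true", since ldap-config-flags values are often quoted.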
If a bundle contains cross-model relations then the checker bombs out as below:
================================================================================
UA Juju bundle config verification
* 2024-06-10 12:57:59.159130
* type=openstack
* bundle=/home/alejandro/Downloads/juju_bundle.yaml
* bundle_sha1=266f88723064211f8bb5e964794a969667f58cf5
* assertions_sha1=43de655a3beb2f75bade815786cfa88e04a16e19
================================================================================
ERROR: Error parsing the bundle file: expected a single document in the stream
  in "<unicode string>", line 1, column 1:
    series: focal
    ^
but found another document
  in "<unicode string>", line 4533, column 1:
    --- # overlay.yaml
    ^
Please check the above errors and run again.
I'd say we should either handle this in the checker or clean up the bundle somehow.
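One possible fix on the checker side, sketched with PyYAML's multi-document loader (the function name is illustrative): treat the first document as the base bundle and keep any trailing overlay documents separately instead of aborting.

```python
import yaml

def load_bundle_documents(path):
    # safe_load_all tolerates the "--- # overlay.yaml" separators that
    # exported bundles with cross-model relations contain.
    with open(path) as f:
        docs = list(yaml.safe_load_all(f))
    base, overlays = docs[0], docs[1:]
    return base, overlays
```

The overlays could then be ignored or merged into the checks, whichever the checker decides.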
When overlays (e.g. LMA offers) are present in the exported bundle, ua-bundle-check fails with the below error:
$ ./ua-bundle-check.py --bundle ../../exported-bundle.yaml
================================================================================
UA Juju bundle config verification
* 2022-09-02 13:19:46.272974
* type=openstack
* bundle=../../exported-bundle.yaml
* bundle_sha1=<removed>
* assertions_sha1=<removed>
================================================================================
ERROR: Error parsing the bundle file: expected a single document in the stream
  in "<unicode string>", line 1, column 1:
    series: focal
    ^
but found another document
  in "<unicode string>", line 3758, column 1:
    --- # overlay.yaml
    ^
Please check the above errors and run again.
Overlay in the exported bundle:
--- # overlay.yaml
applications:
  logstash-server:
    offers:
      logstash-beat:
        endpoints:
        - beat
        acl:
          admin: admin
  nagios:
    offers:
      nagios-monitors:
        endpoints:
        - monitors
        acl:
          admin: admin
  prometheus:
    offers:
      prometheus-target:
        endpoints:
        - target
        acl:
          admin: admin
At this point, the script and kubernetes-extra-checks.sh are focused on listing the failed tests. It would be nice to also output the summary of the test run, such as these lines from the generated tarball:
plugins/e2e/results/global/junit_01.xml:
plugins/e2e/results/global/e2e.log:SUCCESS! -- 337 Passed | 0 Failed | 0 Pending | 5433 Skipped
[current output]
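A rough sketch of pulling that summary line out of the e2e log (the regex and function name are assumptions based on the example line above, which follows the usual ginkgo "SUCCESS!/FAIL!" format):

```python
import re

def e2e_summary(log_text):
    # Return the one-line pass/fail summary printed at the end of
    # plugins/e2e/results/global/e2e.log, or None if absent.
    match = re.search(r"^(SUCCESS|FAIL)! -- .*$", log_text, re.MULTILINE)
    return match.group(0) if match else None
```

The script could print this alongside the failed-test listing so the overall result is visible without opening the tarball.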
After running the ua-reviewkit tests for k8s, the test results are not saved to the same folder. I tried to find the results but, other than what was printed on the screen, I did not find them. I recall seeing a message saying that the results were saved under /tmp, but I did not find them there either. Adding functionality to prescribe the location of the test results would be a great addition.
Newer deployments now split the applications, flags and placement into multiple bundle files. As a result, some of the checks fail because the data is not in the base bundle.yaml.
We need to figure out how to parse the aggregate bundle for the existing checks.
Example failures:
[WARN] global-physnet-mtu (value=1550, expected=9000)
in: overlay-openstack-options.yaml
[FAIL] dns-servers (not found)
also in: overlay-openstack-options.yaml
However, it's not limited to just the overlay-openstack-options.yaml bundle. There are now multiple different bundle files, and I am told it is not standardised which bundles each person doing deployments uses for what.
Overlays in this deployment:
overlay-additional-applications.yaml
overlay-hostnames.yaml
overlay_lma-offers.yaml
overlay-openstack-options.yaml
overlay_openstack-saas.yaml
overlay-openstack-ssl.yaml
overlay-removed-applications.yaml
overlay-service-placement.yaml
overlay-vips.yaml
Additionally, the LMA applications have been moved to the 'lma' model and the lma.yaml file.
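An illustrative sketch of aggregating the base bundle with the overlay files before running checks. The deep-merge here is a simplified assumption, not Juju's exact overlay semantics (it does not handle application removal, for instance), and the function names are hypothetical:

```python
import yaml

def merge_bundles(base_path, overlay_paths):
    # Load the base bundle, then layer each overlay's keys on top so
    # checks see the aggregate view rather than just bundle.yaml.
    with open(base_path) as f:
        bundle = yaml.safe_load(f)
    for path in overlay_paths:
        with open(path) as f:
            overlay = yaml.safe_load(f) or {}
        _deep_update(bundle, overlay)
    return bundle

def _deep_update(dst, src):
    # Recursively merge mappings; scalars and lists from the overlay
    # simply replace the base values.
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dst.get(key), dict):
            _deep_update(dst[key], value)
        else:
            dst[key] = value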
This is because of a known bug in that charm that causes it to skip the discard configuration if bcache devices are in use.
charm bug - https://bugs.launchpad.net/charm-ceph-osd/+bug/1872665
When charms are deployed from Charmhub they must not use the latest/stable channel, as it is unsupported and can contain unexpected charm versions - it was originally set to the last version found in the charmstore (cs:). Instead, charms should use a specific track/channel according to https://docs.openstack.org/charm-guide/latest/project/charm-delivery.html
We should therefore warn when we see charms using latest/stable.
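A minimal sketch of that warning, looking at the charm-level channel field of each application (distinct from the vault snap "channel" config option; names here are illustrative assumptions):

```python
# List applications whose charm channel is the unsupported
# latest/stable track instead of a specific track such as yoga/stable.
def find_latest_stable(applications):
    return sorted(
        name for name, app in applications.items()
        if app.get("channel") == "latest/stable"
    )
```

Each returned application would then be emitted as a [WARN] in the check output.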
Dave C reported that 0.18 no longer works.
ubuntu@iadaz01sinf01:~/dave/handover/ua-reviewkit/kubernetes$ diff -u kubernetes-extra-checks.sh.orig kubernetes-extra-checks.sh
--- kubernetes-extra-checks.sh.orig	2021-02-11 19:43:44.405074951 +0000
+++ kubernetes-extra-checks.sh	2021-02-11 19:11:53.370554415 +0000
@@ -1,6 +1,7 @@
 #!/bin/bash -ex
-SONOBUOY_VERSION=${SONOBUOY_VERSION:-0.18.0}
+#SONOBUOY_VERSION=${SONOBUOY_VERSION:-0.18.0}
+SONOBUOY_VERSION=${SONOBUOY_VERSION:-0.20.0}
 SONOBUOY_PARALLEL=${SONOBUOY_PARALLEL:-30}
 function fetch_sonobuoy() {
@@ -15,7 +16,8 @@
     fi
     ./sonobuoy delete --all || true
kubernetes/README.md has good instructions on how to run Sonobuoy. However, there have been some changes in the upstream Sonobuoy release manifest, and some of the content may no longer be applicable.
It would be nice if those instructions were updated to follow these changes:
https://sonobuoy.io/decoupling-sonobuoy-and-kubernetes/
[kubernetes/README.md]
Sonobuoy depends on kubernetes version that is being used.
As per documentation, each version of sonobuoy will cover that
same k8s version and two older versions (e.g. v0.14.X covers
k8s 1.14, 1.13 and 1.12).
...
Based on that version, check out which is the corresponding
sonobuoy available on:
https://github.com/vmware-tanzu/sonobuoy/releases/
Once the version was found, run the following command, as
the example below:
$ SONOBUOY_VERSION=0.19.0 ./kubernetes-extra-checks.sh
An export-bundle from a recent deployment references all charms as local, e.g. local:ceilometer-agent-0.
The osm charms have evolved since the existing checks were written, so the checks need updating to reflect the current state.
We have a customer K8s which is deployed on top of OpenStack. The customer complained that persistent volume claims were stuck in the Pending state.
The root cause is that OpenStack Cinder uses one AZ named "nova" while nova-compute uses zone1, zone2 and zone3, causing a mismatch.
We ran the sonobuoy validation test suite before delivering the K8s cluster and everything passed; it seems sonobuoy doesn't pick up this issue.
With Kubernetes 1.24 deployed with Juju/MAAS on bare metal, launching sonobuoy the default way:
./sonobuoy run
fails with the following result:
$ ./sonobuoy results $results
Plugin: e2e
Status: failed
Total: 1
Passed: 0
Failed: 1
Skipped: 0
from the e2e log:
Jan 12 14:50:28.313: INFO: ==== node wait: 3 out of 6 nodes are ready, max notReady allowed 0. Need 3 more before starting.
Jan 12 14:50:58.311: INFO: Unschedulable nodes= 3, maximum value for starting tests= 0
Jan 12 14:50:58.311: INFO: -> Node k8s-control-plane-1 [[[ Ready=true, Network(available)=false, Taints=[{juju.is/kubernetes-control-plane true NoSchedule }], NonblockingTaints=node-role.kubernetes.io/control-plane,node-role.kubernetes.io/master ]]]
The workaround is to include the suggested taint to non-blocking-taints arg like this:
./sonobuoy run --plugin-env=e2e.E2E_EXTRA_ARGS=--non-blocking-taints=juju.is/kubernetes-control-plane,true,NoSchedule
So it might be a good idea to launch it the same way in kubernetes-extra-checks.sh.
Check that ceph-access space bindings are correct across applications.