The Grafana helm test fails with the following error:
"NameError: name 'exception' is not defined"
This happens because the exception class name is written in lowercase.
Changing exception to Exception fixes the issue.
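A minimal sketch of the failure mode, not the actual Grafana test code. Python only looks up the name in an `except` clause when an exception reaches it, so the lowercase name goes unnoticed until a check fails. The function names here are hypothetical.

```python
def run_check_broken(check):
    """Mimics the bug: lowercase `exception` is not a built-in name."""
    try:
        return check()
    except exception as err:  # noqa: F821 (intentionally undefined name)
        return f"check failed: {err}"


def run_check_fixed(check):
    """Same handler using the built-in `Exception` class."""
    try:
        return check()
    except Exception as err:
        return f"check failed: {err}"


def failing_check():
    """Hypothetical check that always fails."""
    raise RuntimeError("dashboard not found")
```

With a passing check both variants behave the same; only a failing check makes the broken variant raise NameError instead of reporting the original error.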
Change-Id: I533ae822babb4f063242fee1cd42b5b821519b5f
Signed-off-by: Sreejith Punnapuzha <Sreejith.Punnapuzha@outlook.com>
This PS moves to deploy the default number of RMQ replicas in the gate.
Change-Id: I36734a64b45adce8de89dfe3b020d0dae0e66d94
Signed-off-by: Pete Birley <pete@port.direct>
This PS extends the rabbit startup logic to ensure nodes have
actually joined the cluster on startup.
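The idea of confirming cluster membership at startup can be sketched as a bounded polling loop. This is an illustration only, not the chart's startup script: `list_members` is a hypothetical callable standing in for however the real script asks RabbitMQ which nodes are in the cluster.

```python
import time


def wait_for_cluster_join(node, list_members, attempts=30, delay=2.0,
                          sleep=time.sleep):
    """Poll until `node` is reported as a cluster member.

    `list_members` is a hypothetical callable returning the node names the
    cluster currently reports. Returns True once `node` appears, or False
    after `attempts` polls without seeing it.
    """
    for _ in range(attempts):
        if node in list_members():
            return True
        sleep(delay)
    return False
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.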
Change-Id: Ib876d9abd89209d0a7972983bdf4daacf5f8f582
Signed-off-by: Pete Birley <pete@port.direct>
This PS sets `--enable-ssl-chain-completion=false` for the MariaDB
ingress controller. This is the default for current versions of
the nginx-ingress-controller, but for 0.9.0 needs to be set.
If enableSSLChainCompletion is left on, nginx will attempt to
autocomplete SSL certificate chains with missing intermediate CA
certificates, causing unnecessary network traffic and errors in pod logs.
Change-Id: I088b33fe994281dca6997baa87a6b599c3f10c14
Closes-Bug: #1835364
- Move the cron manifests to the ceph-client chart
- Keep the script that actually does the work in the ceph-osd chart
- With this PS, ceph-defragosds starts after the ceph-client chart
  is deployed. The cronjob execs into a running OSD pod and
  executes the script.
Change-Id: I6e7f7b32572308345963728f2f884c1514ca122d
This configmap holds the MariaDB ingress controller configuration.
It makes it possible to override the default nginx configuration
in the controller.
Change-Id: I25eb8a237a6f8ad63bde725b1d4f31a928fa7c49
Signed-off-by: Yi Wang <yi.c.wang@intel.com>
This updates the helm test logic to check whether any OSDs are up
and to exit if there are none up in the cluster.
This may happen when the ceph-osd label is missing on the nodes.
Change-Id: I98971106e202a9c4fd9d236f368492c6c6498ce1
This PS adds a libvirt image based on Ubuntu Bionic for
use with the stein release of nova.
Change-Id: I8a0c524feadd79bc0632b3c4cff2f692b10633de
Signed-off-by: Pete Birley <pete@port.direct>
This adds the ability to tolerate failures of the selenium tests
in our jobs, as we intermittently see these tests fail. A failure
of these tests should not necessarily fail the job overall, so
this change prevents exactly that.
Change-Id: I4f97fad96f63d42fdb3bb5b8958dbed3dfd7dfc7
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the logic to check for incomplete PGs in the Ceph
cluster and to proceed when there are no incomplete or inactive PGs,
rather than waiting for a healthy Ceph cluster.
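A rough sketch of such a check, assuming the PG states are available as a mapping from state string (for example "active+clean") to PG count. That input shape and the function names are hypothetical, and the actual script may differ. A state is treated as blocking if it carries the "incomplete" flag or lacks the "active" flag.

```python
def blocking_pg_states(pg_state_counts):
    """Return the PG states that should block the deployment.

    `pg_state_counts` maps a '+'-joined PG state string to the number of
    PGs in that state (hypothetical input shape).
    """
    blocking = {}
    for state, count in pg_state_counts.items():
        flags = state.split("+")
        if count > 0 and ("incomplete" in flags or "active" not in flags):
            blocking[state] = count
    return blocking


def can_proceed(pg_state_counts):
    """True when no PG is incomplete or inactive, even if not all are clean."""
    return not blocking_pg_states(pg_state_counts)
```

Under this sketch, degraded but active PGs do not block, which matches the intent of not waiting for a fully healthy cluster.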
Change-Id: I026d6cc378053e805680c31d75fdfb40bbb636f5
This patch fixes an issue with Postgres HA where
the PVC which stores the database was filling up with
WAL records and not deleting them due to some
misconfigurations with Postgres. Once the PVC
would fill up, replication would fail across the node
and the database would not be able to start, crashing
the system.
Specifically, archive_mode was turned on, but was not
supplied with a function through which to archive the
logs. When WAL archiving is turned on, old WAL files
cannot be removed until the system has archived them first.
However, since we never told the system how to archive the
files, it would repeatedly fail so the WAL files would
never be cleaned up.
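The misconfiguration described above can be expressed as a small check. This is only an illustrative sketch: `archive_mode` and `archive_command` are the PostgreSQL parameter names, while the function and the plain-dict input are hypothetical and not part of the chart.

```python
def archiving_problems(settings):
    """Flag archive_mode being enabled without an archive_command.

    `settings` is a plain dict of postgresql.conf-style parameters
    (a hypothetical input shape).
    """
    problems = []
    mode = str(settings.get("archive_mode", "off")).lower()
    command = str(settings.get("archive_command", "")).strip()
    if mode in ("on", "always") and not command:
        problems.append(
            "archive_mode is enabled but archive_command is empty; "
            "WAL files cannot be archived and will accumulate"
        )
    return problems
```

An empty result means this particular failure mode is not present; it says nothing about other causes of WAL growth.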
Also in this patch are some small housekeeping items:
- Lowered wal_keep_segments drastically so Postgres keeps
  fewer WAL segments around, minimizing the chance of PVC
  fill issues
- Changed wal_level from 'logical' to 'hot_standby' to keep
  it consistent with the fact that Patroni uses streaming
  replication and not logical replication
- Removed the autovacuum configurations as they are not
  needed
Change-Id: Id48c3ee9976823b2bdb4395a029fe75476bdaa62
This adds a basic helm-toolkit snippet template for adding
Kubernetes liveness and readiness probes to a container. It adds
flexibility by letting values overrides define the probe contents
wholesale.
Change-Id: I0862ae59c87b8c0c4e2412030b1801bceb3e3c99
Signed-off-by: Pete Birley <pete@port.direct>
This updates the Nagios chart to include an init container for
generating the host and host group definitions Nagios requires to
function. The benefit is that Nagios does not need to constantly
attempt to update its host and host group definitions, which
currently triggers a restart of the Nagios service even in cases
where the host file hasn't changed. With the introduction of an
init container for handling this, we can also remove the service
check definition and command definition for executing the plugin
at periodic intervals
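As a rough illustration of what generating host and host group definitions once at startup could look like, the sketch below renders Nagios object definitions from a list of nodes. It is not the plugin used by the chart; the template name, group name, and input shape are hypothetical.

```python
def render_host_definitions(nodes, template="generic-host",
                            group="kubernetes-nodes"):
    """Render Nagios host and hostgroup object definitions as text.

    `nodes` is a list of (host_name, address) pairs. The `template` and
    `group` defaults are hypothetical placeholders.
    """
    blocks = []
    for name, address in nodes:
        blocks.append(
            "define host {\n"
            f"    use         {template}\n"
            f"    host_name   {name}\n"
            f"    address     {address}\n"
            "}\n"
        )
    members = ",".join(name for name, _ in nodes)
    blocks.append(
        "define hostgroup {\n"
        f"    hostgroup_name  {group}\n"
        f"    members         {members}\n"
        "}\n"
    )
    return "\n".join(blocks)
```

Because the output is written once before Nagios starts, no periodic regeneration or service restart is needed in this model.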
Depends-On: https://review.opendev.org/668197
Change-Id: Id1d63d8c99850b960eb352361d7796162bd6be2f
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the Nagios image used to the image that is built
out of openstack-helm-images instead of the image hosted in quay.
This new image includes the updated host definition plugin that
uses the kubernetes python client instead of prometheus queries,
so the check_prometheus_hosts command has also been updated to
reflect the change in required arguments
Change-Id: If3440ca9be3227fc48cd698a7d44501e6747bb1e
Signed-off-by: Steve Wilkerson <sw5822@att.com>
The ldap overrides values file has been moved to
keystone/values_overrides [1]. This patch updates the reference.
[1] cede6c0d48 (diff-89208df3c46570cf56141a9353ce27a7)
Change-Id: Ib03bb979dc681a647abd36df77f55fd82e0d4df6
This changes the static OSD ID logic to use a variable, as there is
an issue in our current logic.
The issue occurs only when we have file-backed journals and
block-backed data, as shown below.
ex:
  storage:
    osd:
      - data:
          type: block-logical
          location: /dev/vdb
        journal:
          type: directory
          location: /var/lib/openstack-helm/ceph/osd/journal-one
      - data:
          type: block-logical
          location: /dev/vdc
        journal:
          type: directory
          location: /var/lib/openstack-helm/ceph/osd/journal-two
Change-Id: I36d08b1b7aa5925831a64c03259098f6c4753c3e
This adjusts the helm test logic to let the deployment proceed if 80% of
the OSDs are up and running in the cluster.
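The threshold can be sketched as follows, assuming "80%" means "at least 80%" and that the up and total OSD counts are already known. The function name and parameters are hypothetical; integer arithmetic is used to avoid floating-point rounding at the boundary.

```python
def osds_ready(num_up, num_total, required_percent=80):
    """Return True when enough OSDs are up for the test to pass.

    Returns False when there are no OSDs at all or none are up.
    `required_percent` defaults to the 80% mentioned above.
    """
    if num_total <= 0 or num_up <= 0:
        return False
    return num_up * 100 >= required_percent * num_total
```

For example, 8 of 10 OSDs up passes, while 7 of 10 does not.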
Change-Id: I128266fd374426f75928332690e275b7f0175318
The zuul_site_mirror_fqdn env variable may not be set. In that case
the whole job fails, instead of simply not configuring mirrors during
image build. With this patch, if set_fact fails, mirrors are simply
not configured during image build, as planned in lines 62 and 88 of
this playbook.
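The playbook itself is Ansible; the same "optional mirror" behavior is sketched here in Python for illustration. Only the variable name zuul_site_mirror_fqdn comes from the text above; the function and the returned key are hypothetical.

```python
import os


def mirror_settings(env=None):
    """Return mirror settings, or None when no mirror is known.

    Reads `zuul_site_mirror_fqdn` from `env` (defaults to os.environ).
    A missing or empty value is not an error: the caller is expected to
    skip mirror configuration when None is returned.
    """
    env = os.environ if env is None else env
    fqdn = env.get("zuul_site_mirror_fqdn", "").strip()
    if not fqdn:
        return None  # skip mirror configuration instead of failing
    return {"mirror_fqdn": fqdn}
```

The design choice is to treat the absent variable as "no mirror available" and let the build continue with default package sources.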
Change-Id: I049c696c7fb0d7cadb527a9f17dd01a42a671baa
Occasionally the default config can result in attempts
to bind to IPv6, which fail, so we explicitly set the
host to IPv4.
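The commit does not name the affected service, so the following is only a generic sketch of binding a listener to an explicit IPv4 address rather than relying on a default that may resolve to IPv6. The function name and default values are hypothetical.

```python
import socket


def open_ipv4_listener(host="0.0.0.0", port=0):
    """Bind a TCP listener to an explicit IPv4 address.

    Using AF_INET with an IPv4 literal avoids any attempt to bind to
    IPv6. Port 0 lets the OS pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock
```

The caller is responsible for closing the returned socket.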
Change-Id: I3c01ed0ef7c84cf779d88386c14f7c7bd2003310
Signed-off-by: Pete Birley <pete@port.direct>