A new value "rename" has been added to the Ceph pool spec to allow
pools to be renamed in a brownfield deployment. For a greenfield
deployment the pool will be created and renamed in a single
deployment step, and for a brownfield deployment in which the pool
has already been renamed no changes will be made to pool names.
Change-Id: I3fba88d2f94e1c7102af91f18343346a72872fde
The current pool init job only allows PGs to be found in the
"peering", "activating", or active states, but it should also
allow the other states that can occur while the PG autoscaler
is running ("unknown", "creating", and "recover"). The helm test
already allows these states, so the pool init job is changed to
allow them as well, for consistency.
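As a rough illustration of the widened check, here is a minimal Python
sketch. The state names come from this message; the function name, the
list-of-strings input and the substring matching are assumptions made
for the example and are not taken from the actual pool init script.

```python
# Hypothetical sketch: state names are from the commit message, everything
# else (names, input format, substring matching) is assumed.
ALLOWED_STATES = (
    "peering", "activating", "active", "unknown", "creating", "recover",
)


def unexpected_pg_states(pg_states):
    """Return the PG state strings that contain none of the allowed states."""
    return [
        state for state in pg_states
        if not any(allowed in state for allowed in ALLOWED_STATES)
    ]
```

A caller would treat a non-empty result as a reason to fail or to keep
waiting, depending on the job.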
Change-Id: Ib2c19a459c6a30988e3348f8d073413ed687f98b
Since k8s v1.11, the annotation `service.alpha.kubernetes.io/tolerate-unready-endpoints` has been deprecated. We should use Service.spec.publishNotReadyAddresses instead.
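To make the migration concrete, the following Python sketch rewrites a
Service manifest held as a plain dict. The annotation and field names are
the ones named above; the helper name and the dict-based manifest are
assumptions for illustration only, since the charts themselves are Helm
templates.

```python
import copy

DEPRECATED_ANNOTATION = "service.alpha.kubernetes.io/tolerate-unready-endpoints"


def migrate_service(manifest):
    """Return a copy of the manifest using spec.publishNotReadyAddresses."""
    migrated = copy.deepcopy(manifest)
    annotations = migrated.get("metadata", {}).get("annotations") or {}
    value = annotations.pop(DEPRECATED_ANNOTATION, None)
    if value is not None:
        # Annotation values are strings; the spec field is a boolean.
        migrated.setdefault("spec", {})["publishNotReadyAddresses"] = (
            str(value).lower() == "true"
        )
    return migrated
```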
Change-Id: Ic4f82b8e78770ff29637937c4bcb9af71b53f8d3
The current start logic when the existing cluster state is "reboot"
can lead to a split-brain condition under certain circumstances.
This patchset adds an additional step to ensure the cluster is set
to the "live" state once the leader node is ready to start, instead
of relying on slave nodes to handle it. It also adds a simple retry
when a collision is detected while trying to write to the configmap.
The existing hair-trigger that moves the cluster state from
"live" to "reboot" could use some fine-tuning, but updating it
properly requires additional investigation and testing, and
hence should be done as a separate activity outside the scope
of this patchset.
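The retry on configmap write collisions mentioned above could look
roughly like the Python sketch below. The exception class, the function
name and the attempt/delay values are all hypothetical; the real script
would catch whatever conflict error its Kubernetes client raises.

```python
import time


class WriteConflict(Exception):
    """Hypothetical stand-in for a conflict error from the configmap write."""


def write_with_retry(write, attempts=5, delay=1.0, sleep=time.sleep):
    """Call write() until it succeeds or the attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return write()
        except WriteConflict:
            if attempt == attempts:
                raise  # give up and surface the last conflict
            sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without
real delays.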
Change-Id: Ieb2861d6fbc435e24e20d13c7b358c751890b4c4
This updates the checkObjectReplication.py script to use python3,
since python2 has been removed from the ceph images.
Change-Id: I006a4becaeefb2a0cbef6f5d1fb56c7fc40b0170
The change enables:
(1) TLS for the Elasticsearch transport networking layer. The
transport networking layer is used for internal communication
between nodes in a cluster.
(2) TLS path between Elasticsearch and Ceph-rgw host.
Change-Id: Ifb6cb5db19bc5db2c8cb914f6a5887cf3d0f9434
We previously pinned the version of ansible that was run at the gate
due to issues that are no longer impacting us. This change updates
the version of ansible that is deployed in the gate to something
more recent.
Change-Id: I47773eb385ef1b290d1548e8512fda1fec3cac60
The cmd2 package was pinned in order to maintain compatibility,
however quite a bit of time has passed since doing so. This change
unpins cmd2 to use the latest version.
Change-Id: I2b9c8d4c1da91b55301d818861d29cccb64b28cd
Setuptools v54.1.0 introduces a warning that the use of dash-separated
options in 'setup.cfg' will not be supported in a future version [1].
Get ahead of the issue by replacing the dashes with underscores. Without
this, we see 'UserWarning' messages like the following on new enough
versions of setuptools:
UserWarning: Usage of dash-separated 'description-file' will not be
supported in future versions. Please use the underscore name
'description_file' instead
[1] https://github.com/pypa/setuptools/commit/a2e9ae4cb
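For illustration, the rename can be expressed as a small Python helper
that rewrites option names in the text of a setup.cfg. This is only a
sketch: the function name is made up, and limiting the rewrite to the
[metadata] section by default is an assumption chosen so that names such
as console script entries in other sections are left alone.

```python
import re

_SECTION = re.compile(r"^\[(?P<name>[^\]]+)\]\s*$")
# An option name at the start of a line, followed by "=" or ":".
_KEY = re.compile(r"^(?P<key>[A-Za-z][A-Za-z0-9_-]*)(?=\s*[=:])")


def underscore_keys(setup_cfg_text, sections=("metadata",)):
    """Replace dashes with underscores in option names, not in values."""
    section = None
    lines = []
    for line in setup_cfg_text.splitlines():
        header = _SECTION.match(line)
        if header:
            section = header.group("name")
        elif section in sections:
            line = _KEY.sub(lambda m: m.group("key").replace("-", "_"), line)
        lines.append(line)
    return "\n".join(lines)
```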
Change-Id: I238b4e0ca237bca97236004856596002d088220c
With the removal of official support of all openstack releases
older than T, this change updates each job to at least use the
Train release.
Change-Id: I6b41d79495a74b1072995ae5036f56bfbf585c25
Helm2 has been deprecated [0] and along with that the need for tiller.
This patch set removes the tiller chart.
[0] https://helm.sh/blog/helm-v2-deprecation-timeline/
Change-Id: I02bafef5e8559c70fa2959f52e027fbf8a1f771c
Signed-off-by: Tin <tin@irrational.io>
This change enables TLS on the data path between Elasticsearch
and Kibana. Note that TLS terminates at the apache-proxy container
of the elasticsearch-client pod, not directly at port 9200 of the
elasticsearch-client container.
Since all data traffic goes through the apache-proxy container,
fluentd output to Elasticsearch is configured to have TLS
enabled as well.
In addition, other Elasticsearch pods that communicate with
the elasticsearch-client endpoint are modified to provide
the cacert option with curl.
Change-Id: I3373c0c350b30c175be4a34d25a403b9caf74294
These hooks were added as part of a previous change, however tiller
does not handle these correctly, and jobs get deleted without being
recreated. This change removes the hook from default htk annotations.
Change-Id: I2aa7bb241ebbb7b54c5dc9cf21cd5ba290b7e5fd
This change updates the default images for mariadb, both the version
to 10.5.9 and the ubuntu release to focal.
Change-Id: Iff99ebe78554197db4d459bef0dda01b6b2710b7
This patchset removes the test_api_object_creation function, as the
api_objects map is now more dynamic, with the ability to create,
delete, etc.
This function throws an error when it does a GET on objects that
first need to be created (PUT).
The function is no longer relevant with the updated create-templates
job, which is more robust.
Change-Id: I9f37c86ae9ca4bf32c417880926b6a3c3e78cb8a
There seems to be a race condition involving the grastate.dat file.
Upon creation of a new mariadb-server pod the file exists but is
not populated for a short period of time; it seems to take around
15-20 seconds for the file to be populated. However, a separate
thread attempts to read the file during this window and tends to
end in an IndexError exception, killing the thread that maintains
the grastate.dat file until the pod is restarted. This patchset adds
a loop that waits up to 60 seconds for the file to be populated
before attempting to continue, thus giving the file time to be
populated.
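A minimal Python sketch of such a wait loop is shown below. The 60 second
limit is from this message; the function name, the one second polling
interval and the use of "non-empty" as the test for "populated" are
assumptions, and the actual check in the mariadb start script may differ.

```python
import os
import time


def _is_populated(path):
    """True if the file exists and has a non-zero size."""
    try:
        return os.path.getsize(path) > 0
    except OSError:
        return False


def wait_for_populated_file(path, timeout=60, interval=1,
                            sleep=time.sleep, clock=time.monotonic):
    """Return True once path is populated, False after timeout seconds."""
    deadline = clock() + timeout
    while True:
        if _is_populated(path):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The caller decides what to do on False, for example logging the timeout
before continuing or exiting.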
Change-Id: I2f2a801aa4528a7af61797419422572be1c82e75
This patchset makes the current ceph-client helm test more specific
about checking each of the PGs that are transitioning through inactive
states during the test. If any single PG spends more than 30 seconds in
any of these inactive states (peering, activating, creating, unknown,
etc), then the test will fail.
Also, once the three-minute PG checking period has expired, the
helm test no longer fails, as it is very possible that the autoscaler
is still adjusting the PGs for several minutes after a deployment
is done.
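The per-PG timing rule can be sketched in Python as follows. The 30 second
threshold is from this message; the function name, the sampled
{pg_id: state} input and the choice to count consecutive inactive time
across different inactive states are assumptions, not a description of
the actual helm test script.

```python
def pgs_inactive_too_long(samples, interval=1, max_inactive=30):
    """Return ids of PGs that stayed inactive for more than max_inactive.

    samples is a sequence of {pg_id: state} dicts taken interval seconds
    apart, where state is a string such as "active+clean" or "peering".
    """
    inactive_for = {}
    failed = set()
    for sample in samples:
        for pg_id, state in sample.items():
            if "active" in state.split("+"):
                inactive_for[pg_id] = 0  # PG is active again, reset its timer
            else:
                inactive_for[pg_id] = inactive_for.get(pg_id, 0) + interval
                if inactive_for[pg_id] > max_inactive:
                    failed.add(pg_id)
    return sorted(failed)
```

A test built on this would fail only when the returned list is non-empty
within the checking period.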
Change-Id: I7f3209b7b3399feb7bec7598e6e88d7680f825c4
This change removes the releasenotes directory from the
irrelevant-files list of the zuul linter job, since the linter
actually checks those files. Otherwise, fixes for issues with the
releasenotes may be difficult to test when charts become out of date.
Change-Id: I3c4f95a5bc5fb8d9a0ec8dbb8d2f9560f1e46f9a
Memcached stats cachedump is enabled by default. The changes in this
patchset provide an option to configure stats cachedump as desired
during deployment, i.e. stats cachedump can be disabled to prevent
users from obtaining sensitive info via the cachedump data.
Change-Id: Ic6254f89b1478a414ac275436ddd659b16b75f98
Currently the ovs liveness and readiness probe commands are statically
defined in the templates. This change allows them to be changed
as needed, which helps with debugging and making quick adjustments.
Change-Id: I75b4b5a335b75a52f4efbd4ba4ed007106aba4fa
The curator actions in the configmap get set to null, which
causes an error when rendering any actions downstream.
Adding the {} should resolve this issue.
Change-Id: I8c337ee1f089c13f75cb7a9997a7bf6f04246160
Prometheus documentation shows that /-/ready can be used to check that
it is ready to service traffic (i.e. respond to queries) [0]. I've
witnessed cases where Prometheus's readiness probe is passing during
initial deployment using /status, which in turn triggers its helm test
to start. Said helm test then fails because /status is not a
reliable indicator that Prometheus is actually ready to serve traffic,
and the helm test performs actions that require it to be properly
up and ready.
[0]: https://prometheus.io/docs/prometheus/latest/management_api/
Change-Id: Iab22d0c986d680663fbe8e84d6c0d89b03dc6428
This is the first of multiple updates to ceph-osd where the OSD
init code will be refactored for better sustainability.
This patchset makes 2 changes:
1) Removes "ceph-disk" support, as ceph-disk has been removed from
the ceph image since nautilus.
2) Separates the initialization code for the bluestore, filestore,
and directory backend configuration options.
Change-Id: I116ce9cc8d3bac870adba8b84677ec652bbb0dd4