2864 Commits

Author SHA1 Message Date
Zuul
e4683420d7 Merge "Revert "Don't use opendev docker proxy"" 2020-12-07 14:51:47 +00:00
Gayathri Devi Kathiri
20d2aa1553 Update Rabbitmq exporter version
With the current version of rabbitmq-exporter,
data retrieval sometimes fails
with rabbitmq timeout errors.
The rabbitmq timeout threshold is set to 10 sec
and is not configurable in the current version.

Updating the rabbitmq-exporter version to
kbudde/rabbitmq-exporter:v1.0.0-RC7.1
(Default "RABBITMQ_TIMEOUT" set as 30 sec)
to solve rabbitmq timeout issues.

Change-Id: Ia51f368a1bba2b0fd9195cf9991b55864cdebfc1
2020-12-04 11:01:11 +00:00
Zuul
9187633822 Merge "Rabbitmq-exporter: Add configurable RABBIT_TIMEOUT parameter" 2020-12-03 20:56:00 +00:00
Gage Hugo
7fdf282271 Revert "Don't use opendev docker proxy"
This reverts commit 42f3b3eaf5a8794b1f247915fffbef68137e6c1c.

Reason for revert: dockerhub now sets a hard limit on daily pulls, so let's switch back to using the opendev docker proxy.

Change-Id: I87e399c89d5736f39d7bdba2011655e5f5766180
2020-12-03 19:42:47 +00:00
Zuul
970ec5128a Merge "Make publish jobs more generic" 2020-12-02 19:51:46 +00:00
Gayathri Devi Kathiri
d7107a5c5c Rabbitmq-exporter: Add configurable RABBIT_TIMEOUT parameter
This PS makes the RABBIT_TIMEOUT parameter configurable
with the kbudde/rabbitmq-exporter:v1.0.0-RC7.1 version

Change-Id: I8faf8cd706863f65afb5137d93a7627d421270e9
2020-12-02 16:42:49 +00:00
Zuul
59164428d3 Merge "Fluentd: Add Configurable Readiness and Liveness Probes" 2020-12-01 20:22:57 +00:00
Steven Fitzpatrick
29489acf39 Fluentd: Add Configurable Readiness and Liveness Probes
This change updates the fluentd chart to use HTK probe templates
to allow configuration by value overrides

Change-Id: I97a3cc0832554a31146cd2b6d86deb77fd73db41
2020-11-30 18:39:07 +00:00
Taylor, Stephen (st053q)
e37d1fc2ab [ceph-osd] Add a check for misplaced objects to the post-apply job
OSD failures during an update can cause degraded and misplaced
objects. The post-apply job restarts OSDs in failure domain
batches in order to accomplish the restarts efficiently. There is
already a wait for degraded objects to ensure that OSDs are not
restarted on degraded PGs, but misplaced objects could mean that
multiple object replicas exist in the same failure domain, so the
job should wait for those to recover as well before restarting
OSDs in order to avoid potential disruption under these failure
conditions.

Change-Id: I39606e388a9a1d3a4e9c547de56aac4fc5606ea2
2020-11-30 10:17:40 -07:00
Zuul
3205c8b778 Merge "Fix values_overrides directory naming" 2020-11-27 19:22:34 +00:00
Zuul
5600c76e0b Merge "Changing the kube version to 1.18.9" 2020-11-27 19:20:56 +00:00
MirgDenis
5f6adeca06 Fix values_overrides directory naming
According to get-values-overrides.sh script it is expected to
have values_overrides directory, not value_overrides.

Change-Id: I53744117af6962d51519bc1d96329129473d9970
2020-11-27 10:59:20 +02:00
Taylor, Stephen (st053q)
791b0de5ee [ceph-osd] Fix post-apply job failure related to fault tolerance
A recent change to wait_for_pods() to allow for fault tolerance
appears to be causing wait_for_pgs() to fail and exit the post-
apply script prematurely in some cases. The existing
wait_for_degraded_objects() logic won't pass until pods and PGs
have recovered while the noout flag is set, so the pod and PG
waits can simply be removed.

Change-Id: I5fd7f422d710c18dee237c0ae97ae1a770606605
2020-11-24 06:30:37 -07:00
Zuul
15ad6e9a6c Merge "fix(secret): changes rmq-exporter secret src" 2020-11-23 22:52:55 +00:00
Zuul
02368a4d99 Merge "[ceph] Make sure loopback devices persistent across reboots" 2020-11-23 22:49:04 +00:00
Andrii Ostapenko
13315e57a7 Fix openvswitch gate issue with systemd 237-3ubuntu10.43
New systemd 237-3ubuntu10.43 bumps memlock limit from 16 to 64 MB [1]
which seems to cause issues with eBPF related operations in containers
run with root [2] as a possible root cause.

The options are to downgrade systemd to the previous available
version, or to restore the previous default memlock limit in either the
systemd defaults or the docker unit. This commit sets systemd
DefaultLimitMEMLOCK.

[1] https://launchpad.net/ubuntu/+source/systemd/237-3ubuntu10.43
[2] https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1837580/comments/9

Change-Id: I55d14ffa47a7a29d059f2f3b502bb38be0a5dd3d
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
2020-11-22 14:04:15 +00:00
Tin Lam
f001105aad fix(secret): changes rmq-exporter secret src
This patch set changes the source of the rabbitmq-exporter's admin user
credential to leverage the existing secret rather than the values in the
Values.yaml file.

Change-Id: I1ad48ade3984e455d07be3a8b8ee3d9b25b449a2
Signed-off-by: Tin Lam <tin@irrational.io>
2020-11-19 18:16:48 -06:00
Mohammed Naser
ca60e1d875 Make publish jobs more generic
This will help the openstack-helm repo publish cleanly
to a separate folder.

Change-Id: I2651c2f81191802a8f30314c4eebffdf0c2a53af
2020-11-11 19:22:55 -05:00
Gupta, Sangeet (sg774j)
c988632091 Changing the kube version to 1.18.9
Change-Id: I216d16de1f4fb1438534c9362b57499ec3d6725b
2020-11-09 23:15:33 +00:00
Chinasubbareddy Mallavarapu
515d31f9ae [ceph] Make sure loopback devices persistent across reboots
Change-Id: I50ddfcf0903fe00fc020c819e784ea289d5baae6
2020-11-09 21:23:03 +00:00
Andrii Ostapenko
ca372bfea6 Fix typo in check inactive PGs logic
Issue introduced in https://review.opendev.org/761031

Change-Id: I154f91e17b5d9a84282197ae843c5aab2ce1d0be
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
2020-11-09 17:53:41 +00:00
Zuul
577a7fdb6f Merge "Remove divingbell job" 2020-11-07 20:42:07 +00:00
Kabanov, Dmitrii
011e5876c0 [ceph-osd] Check inactive PGs multiple times
The PS updates the post-apply job so that inactive PGs that are
not peering are checked multiple times. The wait_for_pgs() function
fails after 10 sequential positive checks.

Change-Id: I98359894477c8e3556450b60b25d62773666b034
2020-11-03 00:50:42 +00:00
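
As a rough illustration of the counting described in this commit message (not the chart's actual shell code), the Python sketch below treats each poll as a boolean and fails only after 10 positive checks in a row. The function name, the `limit` parameter, and the reset-on-negative behaviour implied by "sequential" are assumptions made for the example.

```python
from typing import Iterable


def pgs_check_passes(polls: Iterable[bool], limit: int = 10) -> bool:
    """Return False once `limit` consecutive polls are positive.

    Each item in `polls` is True when that poll found inactive PGs that
    are not peering (a "positive check"). Hypothetical helper; the real
    wait_for_pgs() lives in the ceph-osd post-apply script.
    """
    streak = 0
    for found_inactive in polls:
        # A negative poll resets the streak (assumed from "sequential").
        streak = streak + 1 if found_inactive else 0
        if streak >= limit:
            return False
    return True
```

With this sketch, nine positive polls followed by a negative one do not fail the check, while ten positive polls in a row do.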
Gage Hugo
3182b01d82 Remove divingbell job
This change removes the non-voting divingbell job from
openstack-helm-infra checks due to not really being used to
test much functionality.

Change-Id: I343b4cdc98d637522ac854211a974cc86d49cae6
2020-10-30 13:29:22 -05:00
Chinasubbareddy Mallavarapu
7c8ca55ac0 [ceph-provisioners] Validate each storageclass created
This includes every storageclass that gets created in the
helm tests.

Change-Id: I62dc11600d00fe2ec7babb1688e61d3eaa50100c
2020-10-28 22:14:49 +00:00
Zuul
e74674324b Merge "Add capability to delete a backup archive" 2020-10-28 20:17:20 +00:00
Parsons, Cliff (cp769u)
2d1fe882bb Add capability to delete a backup archive
This patchset adds the capability to delete any archives that are stored
in the local file system or archives that are stored on the remote RGW
data store.

Change-Id: I68cade39e677f895e06ec8f2204f55ff913ce327
2020-10-28 16:19:31 +00:00
Andrii Ostapenko
22cfea81d0 Split deployment script sets to improve stability
Change-Id: I848d6ad0ce52863bf4a13b96b2afbf79bfaf70fc
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
2020-10-28 15:01:45 +00:00
okozachenko
63b7a0cd0f Update ingress tpl in helmtoolkit
- Check issuer type to distinguish the annotation between
clusterissuer and issuer
- Add one more annotation "certmanager.k8s.io/xx" for old version

Change-Id: I320c1fe894c84ac38a2878af33e41706fb067422
2020-10-28 07:06:51 +00:00
Andrii Ostapenko
42f3b3eaf5 Don't use opendev docker proxy
Looks like using the docker proxy is slower and less stable than pulling
from dockerhub directly, and contributes to some of the unstable builds.

This reverts commit e3f14aaff35364b84acedf53b3778111cbae0373.

Change-Id: I9735ad35ce9240f610479a56eaa38715defa2e04
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
2020-10-27 10:33:40 -05:00
Zuul
757a353b70 Merge "postgresql: Revert "Add default reject rule ..."" 2020-10-24 02:27:40 +00:00
Kabanov, Dmitrii
d39abfe0f0 [ceph-osd] Update post apply job
The PS updates the wait_for_pods() function in the post-apply script.
The changes allow wait_for_pods() to pass once the required percentage
of OSDs is reached (REQUIRED_PERCENT_OF_OSDS). Also removed a part of
the code which is not needed any more.

Change-Id: I56f1292682cf2aa933c913df162d6f615cf1a133
2020-10-23 19:00:58 +00:00
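
The threshold idea in this commit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, its arguments, and the default of 75 are hypothetical, and only the REQUIRED_PERCENT_OF_OSDS name comes from the commit message.

```python
def enough_osds_ready(ready: int, expected: int, required_percent: int = 75) -> bool:
    """Return True when at least `required_percent` of expected OSDs are ready.

    `required_percent` stands in for REQUIRED_PERCENT_OF_OSDS; the default
    of 75 is an assumed value chosen for the example only.
    """
    if expected <= 0:
        # Nothing to wait for cannot count as "enough OSDs ready".
        return False
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return ready * 100 >= required_percent * expected
```

Under these assumptions, 3 of 4 ready OSDs meets a 75 percent requirement, while 2 of 4 does not.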
Phil Sphicas
20288319af postgresql: Revert "Add default reject rule ..."
This reverts commit 982e3754a5755cc227552b6f1fcc195e8793589c.
"Add default reject rule end in Postgres pg_hba.conf to ensure all
connections must be explicitly allowed."

The original commit introduced a breaking change when installing with
the chart defaults - before, all remote connections with md5 auth were
allowed, and after the change, only explicit users are allowed.

This is fully overridable, but the original defaults are more
conservative.

Change-Id: Ib297e480bccd3ac7c0cf15985b3def2c8b3e889e
2020-10-23 17:50:50 +00:00
Phil Sphicas
c43331d67a postgresql: Optimize restart behavior
* add preStop hook to trigger Fast Shutdown
* disable readiness probe by default

When Kubernetes terminates a pod, the container runtime typically sends
a SIGTERM signal to pid 1 in each container [0]. PostgreSQL interprets
SIGTERM as a request to do a "Smart Shutdown" [1]. This can take minutes
(often exhausting the termination grace period), and during this time,
new connections are not being serviced.

Now that postgresql has a single replica, this behavior is undesirable.
If we kill the pod (e.g. in an upgrade), we probably want it to come
back as soon as possible.

This change adds a preStop hook that sends a SIGINT to postgresql in
order to trigger a "Fast Shutdown". In addition, the readiness probe is
disabled by default, since it adds no value in a single-replica
scenario.

0: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
1: https://www.postgresql.org/docs/9.6/server-shutdown.html

Change-Id: Ib5f3d2a49e55332604c91f9a011e87d78947dbef
2020-10-23 07:41:57 +00:00
Phil Sphicas
a10699c4e0 postgresql: Allow probe tweaking
Uses the standard helm-toolkit macros for liveness and readiness probes,
allowing them to be enabled or disabled, and params to be overridden.

The existing hard-coded settings are preserved as the chart defaults.

Change-Id: Idd063e6b8721126c88fa22c459f93812151d7b64
2020-10-23 06:52:45 +00:00
KHIYANI, RAHUL (rk0850)
b4d0793b98 Add pod/container security context template to create_db.yaml
This enables the runAsUser and ReadOnly-fs flags to be overridden in
values.yaml

Change-Id: I2e5cbd57f90ef1f5c09b7a54cd04d92dcfd8edc5
2020-10-22 20:50:25 +00:00
Zuul
a7cfefddb5 Merge "Fix spacing inconsistencies with flags" 2020-10-22 15:39:33 +00:00
Zuul
9332c2961e Merge "Fix ks-user script case matching for domain" 2020-10-21 17:51:04 +00:00
Smith, David (ds3330)
9d9aaa8948 Fix spacing inconsistencies with flags
Change-Id: Ia8f7437071a8865f1470412ad616b67a38142719
2020-10-21 13:44:07 +00:00
Tin Lam
62b10c7d49 chore(pkg): updates the chart packaging
Part 2. This patch set adjusts the url once the initial packages are
made available.

Change-Id: Idfb69146d606b43c98c552d1d2c5680ccd503282
Signed-off-by: Tin Lam <tin@irrational.io>
2020-10-21 00:58:16 -05:00
Tin Lam
738c89b342 fix(job): fixes the post job
This corrects the ability to sync artifacts to tarballs.o.o.

Change-Id: Icb2b6653f263aaab173d1479d05c0209e7390c50
Signed-off-by: Tin Lam <tin@irrational.io>
2020-10-20 22:43:10 -05:00
Tin Lam
da81705a47 fix(post): fixes publish job
This fixes a typo in the publish job.

Change-Id: I077feb29a8764a0b3031b34b462779c911baaee3
Signed-off-by: Tin Lam <tin@irrational.io>
2020-10-19 11:53:46 -05:00
Gage Hugo
cddf665c16 Fix ks-user script case matching for domain
Some services attempt to recreate the default domain
with both the values of "default" and "Default". Since this
domain already exists when keystone is deployed, this
creates redundant API calls that only result in conflicts.

This change enables nocasematch for string checking in order
to avoid making multiple unnecessary calls to keystone.

Change-Id: I698fd420dc41eae211a511269cb021d4ab7a5bfc
2020-10-19 05:03:58 +00:00
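
The commit itself relies on the shell's nocasematch option in the ks-user script. The same idea, comparing the domain name without regard to case so that "default" and "Default" are treated as one existing domain, can be sketched in Python as follows. The function name is hypothetical and this is not the script's code.

```python
def is_default_domain(name: str) -> bool:
    """Return True if `name` refers to the default domain, ignoring case.

    Hypothetical helper: mirrors the effect of case-insensitive matching,
    so callers can skip a redundant create call for "default"/"Default".
    """
    return name.casefold() == "default".casefold()
```

A caller would check `is_default_domain(domain)` before attempting to create the domain and skip the API call when it returns True.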
Tin Lam
e5c776e5c4 chore(pkg): updates the chart packaging
This patch set updates the ability to package (and subsequently publish)
the charts in the OpenStack-Helm-Infra repository.

Change-Id: I6175325b0e7a668c22a7ec3ab08cae51ad4f9ab8
Signed-off-by: Tin Lam <tin@irrational.io>
2020-10-17 08:42:53 +00:00
Zuul
a282491ba6 Merge "[ceph-client] fix the logic to disable the autoscaler on pools" 2020-10-17 01:47:54 +00:00
Chinasubbareddy Mallavarapu
c3f921c916 [ceph-client] fix the logic to disable the autoscaler on pools
This is to fix the logic to disable the autoscaler on pools, as
it's not considering newly created pools.

Change-Id: I76fe106918d865b6443453b13e3a4bd6fc35206a
2020-10-16 21:17:07 +00:00
Stephen Taylor
16b72c1e22 [ceph-osd] Synchronization audit for the ceph-volume osd-init script
There are race conditions in the ceph-volume osd-init script that
occasionally cause deployment and OSD restart issues. This change
attempts to resolve those and stabilize the script when multiple
instances run simultaneously on the same host.

Change-Id: I79407059fa20fb51c6840717a083a8dc616ba410
2020-10-16 18:30:57 +00:00
Tin Lam
3a2d0f83b4 chore(charts): addresses issues with chart publish
This change attempts to address the chart publish issue. Also makes
the job periodic.

Change-Id: I806da82a7eb07ce8e83ae8c023a014fa3b917193
Signed-off-by: Tin Lam <tin@irrational.io>
2020-10-16 15:15:35 +00:00
Zuul
af712da863 Merge "Update image version from v2.0.0-alpha to v2.0.0-alpha-1" 2020-10-15 15:34:22 +00:00
Chinasubbareddy Mallavarapu
321b8cb7e3 [ceph-osd] Logic improvement for used osd disk detection
This is to improve the logic to detect used osd disks so that scripts will
not zap the osd disks aggressively.

Also adds debugging mode for pvdisplay commands to capture more logs
during failure scenarios, along with reading the osd force repair flag from
values.

Change-Id: Id2996211dd92ac963ad531f8671a7cc8f7b7d2d5
2020-10-15 13:13:28 +00:00