Jobs sometimes fail, and the default of 6 retries is far too brief to
collect logs (which are purged after the final failure). Since the jobs
must always succeed, a much higher default seems prudent.
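To make "far too brief" concrete, here is a rough sketch of how little time a handful of retries can cover. It assumes, and this is only an assumption, that the retries are Kubernetes Job retries with an exponential back-off of 10s, 20s, 40s and so on, capped at six minutes. The function name and parameters are hypothetical.

```python
def total_backoff_seconds(retries, base=10, cap=360):
    """Sum the first `retries` back-off delays, doubling from `base` up to `cap`.

    Assumption: delays follow 10s, 20s, 40s, ... capped at 360s.
    """
    return sum(min(base * 2 ** attempt, cap) for attempt in range(retries))


# With only 6 retries the accumulated back-off is a little over ten minutes.
print(total_backoff_seconds(6))
```

Under these assumptions a much larger retry count mostly adds six-minute waits, which leaves far more time to inspect a failing job before its logs are purged.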
Change-Id: I7f20a3eb9a98669ae4af657d36a776830b82dfca
This fixes the logic that finds the OSD ID for a WAL LVM device and
the logic that finds the correct LVM device for an OSD disk.
Change-Id: Id4ee1dbd5c82dcbe9893f81c3ad3b9e18d1f9509
This fixes the logic to use the OSD device name instead of the whole
disk path while initializing the OSD.
It also corrects the ceph osd ls command to use the correct keyring.
Change-Id: I90f0c3fd5d1e1b835326b1c690582990f7ca15cb
This waits for all the OSD devices before initializing, and adds a
few more checks to determine whether a disk is already in use.
Change-Id: I68e1d4c8c1ade39f856c69333585dfcba3ea35ab
This commit adds an audit user to the postgresql database which
will have only SELECT privileges on the postgresql database tables.
This is accomplished by setting up audit user creation parameters
in the Patroni bootstrap environment settings, according to (1).
(1) https://patroni.readthedocs.io/en/latest/ENVIRONMENT.html
Change-Id: Idf1cd90b5d093f12fa4a3c5c794d4b5bbc6c8831
In this PS we explicitly define the admin user rather than letting
patroni use the default username and password.
Change-Id: I9885314902c3a60e709f96e2850a719ff9586b3d
The values.yaml in the LDAP chart contains a duplicate network_policy:
key in the manifests: section. This patch removes the duplicate.
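Duplicate keys like this are easy to miss because most YAML parsers silently keep the last value. A minimal sketch of a checker follows. It is a naive scanner for simple block-style mappings only: it ignores sequences and flow syntax, so it may report false positives for maps nested under list items. The sample keys other than network_policy are hypothetical.

```python
import re

# Matches "  key:" at the start of a line; list items ("- key:") do not match.
KEY_RE = re.compile(r"^(\s*)([A-Za-z0-9_.-]+):(\s|$)")


def find_duplicate_keys(text):
    """Return (line_number, dotted_path) for each repeated key in one mapping."""
    duplicates = []
    root_seen = set()
    stack = []  # entries of (indent, key, keys_seen_among_children)
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if not match:
            continue
        indent, key = len(match.group(1)), match.group(2)
        # Drop stack entries that are siblings or deeper than this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        seen = stack[-1][2] if stack else root_seen
        if key in seen:
            path = ".".join([entry[1] for entry in stack] + [key])
            duplicates.append((lineno, path))
        seen.add(key)
        stack.append((indent, key, set()))
    return duplicates


SAMPLE = """\
manifests:
  configmap_bin: true
  network_policy: false
  statefulset: true
  network_policy: false
"""

print(find_duplicate_keys(SAMPLE))
```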
Change-Id: I677acaf7d96d92fecb93c30782f1e760ab4bec84
Signed-off-by: Tin Lam <tin@irrational.io>
When DPDK is enabled, configuring CPU resource limits
through Kubernetes adversely affects packet throughput:
the DPDK PMD cores cannot reach 100% busy.
Instead, the cores need to be isolated in the host grub configuration
and then assigned through the PMD core mask.
Change-Id: Ia80880302b9c5c02fdb1c00cb62f6640860e898e
An audit user is added to MariaDB with only the SELECT permission
on the mysql database user table, for database user audit purposes.
Change-Id: I5d046dd263e0994fea66e69359931b7dba4a766c
This moves from using the docker profile to the default
runtime profile - which allows container engines other than
docker to work out of the box.
Change-Id: Ica5a48f8c43b90f07969b41e10dc472a772b5b43
Signed-off-by: Pete Birley <pete@port.direct>
Validate that the container bucket exists and, if so,
delete it and its objects that were orphaned by
a failed deployment's helm-tests.
Change-Id: Ibaa6d0f6dd36b319c354b65e43dc6053418f4d1d
In the Ceph Cluster Dashboard, the OSDs In, OSDs Out, OSDs Down panel
was showing wrong values. This updates the expression from "count" to
"sum" to show the correct values.
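The difference between the two aggregations can be sketched in a few lines. This assumes the underlying metric is a per-OSD gauge holding 1 or 0 (for example 1 for "in"). The sample values and helper names are hypothetical and only mimic the aggregation semantics.

```python
# Hypothetical samples: one 0/1 gauge value per OSD, where 1 means "in".
osd_in = {"osd.0": 1, "osd.1": 1, "osd.2": 0, "osd.3": 1}


def count_series(samples):
    """Mimic count(): the number of series, regardless of their values."""
    return len(samples)


def sum_series(samples):
    """Mimic sum(): the total of the series values."""
    return sum(samples.values())


print(count_series(osd_in))  # counts every OSD, including the one that is out
print(sum_series(osd_in))    # counts only OSDs whose value is 1
```

With a 0/1 gauge, counting series always reports the total number of OSDs, while summing reports how many are actually in the given state.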
Change-Id: I1959eeb445bf297c1ec696f3867315f05552b03e
This patch set adds a default Kubernetes egress network
policy for the postgresql database chart.
Change-Id: I6caa917faf23becc3a1c09b47f457b8b2db996e4
Signed-off-by: Tin Lam <tin@irrational.io>
This change adds a means of introducing new storage classes
and local persistent volumes.
Change-Id: I340c75f3d0a1678f3149f3cf62e4ab104823cc49
Co-Authored-By: Steven Fitzpatrick <steven.fitzpatrick@att.com>
This patch set fixes a mismatch in the CN in the sample LDAP data.
Change-Id: Ie4c1cc46355e930b6b5bd65b5a55da11df1acd75
Signed-off-by: Tin Lam <tin@irrational.io>
This updates the Elastic Beats charts to 7.1.0 to keep them
aligned with the Kibana and Elasticsearch chart versions, which
is required for compatibility.
This also updates the experimental job to use the single node
minikube deployment as opposed to the standard 5 node multinode
deployment.
Change-Id: I4baba6ca2ea2f3785f11905138b67979a4501caa
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This patch set updates and tests the apiVersion for rbac.authorization.k8s.io
from v1beta1 to v1 in preparation for its removal in k8s 1.20.
Change-Id: I4e68db1f75ff72eee55ecec93bd59c68c179c627
Signed-off-by: Tin Lam <tin@irrational.io>
This change adds missing network policy overrides for
fluent-daemonset and prometheus-exporter, and removes
existing mariadb network policy overrides that were causing
the network policy check job to fail.
Change-Id: Ib7a33f3d14617f9a9fda264f32cde7729a923193
This updates the overrides used in the apparmor nonvoting job, as
recent changes to the Elasticsearch chart values structure have
resulted in this job's repeated failure.
Change-Id: Id5427cd19a382e72435ab361003bbd5f99d678ce
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the deployment scripts for Prometheus to leverage the
feature gate functionality rather than bash generation of the list
of override files to use for alerting rules.
Change-Id: Ie497ae930f7cc4db690a4ddc812a92e4491cde93
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This fixes an issue with ceph-osd initialization when deployed
with the WAL and DB on the same disk, where a pod restart always
tried to prepare the disk.
This PS handles that case and skips the ceph-volume prepare
step for an OSD disk that is already deployed.
Change-Id: I5c37568f342cb4362a0de0a9c11a52b7aea3e147
This removes a duplicated key in the values.yaml in the
ceph-client chart.
Change-Id: Iff4969fc1de7f0b1d34d3aac63ffac835c8fc7ed
Signed-off-by: Tin Lam <tin@irrational.io>
Removes become: and become_user: when including another role (that
already defines become: true and become_user: root).
Fixes an error occurring in the gates:
ERROR! 'become_user' is not a valid attribute for a IncludeRole
Change-Id: I362eefbe5b09ad64e97b3b541d07db2e6b990613
I noticed that some nagios service checks were checking prometheus
alerts which did not exist in our default prometheus configuration.
In one case a prometheus alert did not match the naming convention
of similar alerts.
One nagios service check, ceph_monitor_clock_skew_high, does not
have a corresponding alert at all, so I've changed it to check the
node_ntmp_clock_skew_high alert, where a node has the label
ceph-mon="enabled".
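A mismatch like this can be caught mechanically by comparing the alert names the service checks query against the alert names the rules define. The sketch below assumes both lists have already been extracted. The helper name and the "example_" entries are hypothetical, and the data is illustrative rather than taken from the charts.

```python
def undefined_alerts(checked, defined):
    """Return alert names that checks reference but no rule defines, sorted."""
    return sorted(set(checked) - set(defined))


# Hypothetical mapping of nagios service check -> prometheus alert it queries.
checks = {
    "ceph_monitor_clock_skew_high": "ceph_monitor_clock_skew_high",
    "example_disk_check": "example_disk_alert",
}

# Hypothetical set of alert names defined in the prometheus rules.
defined = {"node_ntmp_clock_skew_high", "example_disk_alert"}

print(undefined_alerts(checks.values(), defined))
```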
Change-Id: I2ebf9a4954190b8e2caefc8a61270e28bf24d9fa
nginx-ingress-controller 0.26.1 introduces configurable parameters for
streamPort and profilerPort, and changes the default for statusPort.
This change allows those parameters to be configured, while maintaining
compatibility with earlier versions of nginx-ingress-controller. It also
modifies the default status port value from 18080 to 10246.
Reference: https://github.com/kubernetes/ingress-nginx/blob/master/Changelog.md#0261
Change-Id: I88a7315f2ed47c31b8c2862ce1ad47b590b32137
k8s 1.14 first enabled Ingress in the networking.k8s.io/v1beta1 API
group, while still serving it in the extensions/v1beta1 API group. The
extensions/v1beta1 API endpoint is deprecated in 1.16 and scheduled for
removal in 1.20. [0]
ingress-nginx 0.25.0 actually uses the networking.k8s.io/v1beta1 API,
which requires updated RBAC rules. [1]
This change updates the ClusterRole used by the ingress service account
to grant access to Ingress resources via either the extensions/v1beta1
or networking.k8s.io/v1beta1 API, aligning with the static manifests
from the kubernetes/ingress-nginx repo [2]. It does not change the
apiVersion used when creating Ingress resources.
[0] https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/
[1] https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.25.0
[2] 870be3bcd8/deploy/static/mandatory.yaml (L50-L106)
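As a rough illustration of what granting access via either API group means, the sketch below models one RBAC rule as a plain dictionary and checks it against both groups. The rule contents, in particular the verbs, are an assumption for illustration and are not copied from the chart. The helper is hypothetical and does not handle wildcards.

```python
# Illustrative rule only; the verbs listed here are an assumption.
INGRESS_RULE = {
    "apiGroups": ["extensions", "networking.k8s.io"],
    "resources": ["ingresses"],
    "verbs": ["get", "list", "watch"],
}


def grants_access(rules, api_group, resource, verb):
    """Return True if any rule names the group, resource, and verb explicitly.

    Wildcards ("*") are deliberately not handled in this sketch.
    """
    return any(
        api_group in rule.get("apiGroups", [])
        and resource in rule.get("resources", [])
        and verb in rule.get("verbs", [])
        for rule in rules
    )


for group in ("extensions", "networking.k8s.io"):
    print(group, grants_access([INGRESS_RULE], group, "ingresses", "list"))
```

Listing both groups in one rule lets the same service account work whether the controller talks to the older or the newer API endpoint.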
Change-Id: I67d4dbdb3834ca4ac8ce90ec51c8d6414ce80a01
This change adds a non-voting bandit check to openstack-helm-infra,
similar to what is run in the openstack-helm repo.
This check will be made voting in a future change once the current
failures are addressed.
Similarly, this check will be modified in a future change to
run only when affected Python files are changed.
Change-Id: I177940f7b050fbe8882d298628c458bbd935ee89