This patch enhances the HTK job manifest functions so that each job can
be configured to use the desired backoffLimit and activeDeadlineSeconds,
and can mount the command/script from either a configMap or a secret
instead of being confined to using only configMaps.
Change-Id: I5231e53b98e3e55e3e93070876d8694f37ad642d
Appended the code that will add the calico dashboard to the Grafana. This will
display the felix metrics which are collected by the prometheus.
Change-Id: If18a18949f8093747b3f9ba819e036778c40b84e
We can select if we want an image with dpdk support by adding:
FEATURE_GATES=dpdk
That way we can reuse the same script for different distros by using
openstack-helm/tools/deployment/common/get-values-overrides.sh
Change-Id: Ia2c53556be650899fdd67c1ec06f5c68ae63c9d4
Signed-off-by: Manuel Buil <mbuil@suse.com>
As per PR, https://github.com/kubernetes/kubernetes/pull/60210,
in kubectl get show-all option is deprecated and no longer needed.
Presumably now that's the default behavior.
Also in current logs gathering logic, we are interested in capturing
only pod names, so removing that option is harmless.
We are seeing related failures in local CI when kubectl version is
1.15.x. So removing this option.
Change-Id: I3886c792fe28bc8b80504d8c91e9524039131b15
This updates the script for registering snapshot repositories to
include a manual verification of the repositories created. This
simply allows for inspection of all master and data nodes the
repository is verified with to provide additional visibility into
the state of all repositories
Change-Id: I6e5386386e2b79b1cb0f41fc1f9b78817695f8f3
Signed-off-by: Steve Wilkerson <sw5822@att.com>
Revert 833d426da8e4b049277ca9847830f6e6beee40c3
https://review.opendev.org/#/c/667022 introduced a regression in the
overrides functionality, which caused the corresponding gate test to
fail. This "fixed" a problem by breaking the override capability.
This patchset reverts the previous to restore override functionality and
make gates green again. Deep copy is added in order to resolve the
original problem that 667022 attempted to resolve.
Change-Id: I6c052c0fabe0067612d6a3d9d3bfac4df59202d7
This is to fix the issue we are facing with file permision on the file
/var/lib/ceph/bootstrap-rgw/ceph.keyring since owner of the file
will be root.
This is happening when node with rgw reboots and rgw pods fails at
init after reboot,this is happening on sinlge node deplyoments.
issue:
ceph-rgw-5db485fbd9-dv778 0/1 Init:CrashLoopBackOff 5 6m49s
logs:
+ chown -R ceph. /run/ceph/ /var/lib/ceph/bootstrap-rgw /var/lib/ceph/radosgw
/var/lib/ceph/tmp
chown: changing ownership of
'/var/lib/ceph/bootstrap-rgw/ceph.keyring': Operation not permitted
Change-Id: Idcb648c205053b2f03357b59173e70e02f28688c
Earlier the Nagios alert monitor was percent based as in when the percent of OSD
down is greater than 80, it will send alert.
>check_prom_alert!ceph_osd_down_pct_high!CRITICAL- CEPH OSDs down is
more than 80 percent!OK- CEPH OSDs down is less than 80 percent
Updated the code in nagios values.yaml to send alert when even 1 OSD is
down:
>check_prom_alert!ceph_osd_down!CRITICAL- One or more CEPH OSDs are down
>for more than 5 minutes!OK- All the CEPH OSDs are up
Change-Id: Id24c4a0cca64674890dae3599edc0c90d9534e90
This patch is part of an effort to cleanup the values.yaml file for
Postgres, which has gotten messy since the introduction of Patroni. This
patch specifically removes unused configuration values which were
causing unnecessary bloat and complexity.
Change-Id: I96180fd9c91200ba7558e58bd503b4ef9ebc183e
When upgrading/reconfiguring a rabbit cluster its possible that the nodes
will not return the cluster status for some time, this ps allows us to
cope with this much more gracefully than simply crashing a few times, before
proceeding.
Change-Id: Ibf525df9e3a9362282f70e5dbb136430734181fd
Signed-off-by: Pete Birley <pete@port.direct>
The tag is pointing to a libvirt image. It should point to the
openvswitch image
Change-Id: If95a7b9cce2cadcb644389c28799fff48572c549
Signed-off-by: Manuel Buil <mbuil@suse.com>
This PS updates the cluster wait job to prune any extra nodes from
the cluster if scaling down.
Change-Id: I58d22121a07cd99448add62502582a6873776622
Signed-off-by: Pete Birley <pete@port.direct>
This PS cleans up the container dir entirely on container restart,
as sometimes remnets of previous runs can cause issues.
Change-Id: I873667a8a57bca6096cbe777ee83ef8648a368d4
Signed-off-by: Pete Birley <pete@port.direct>
Currently, we are getting `bind-address: null` in ingress-conf for ingress pod in kube-system namespace
In that case, nginx starting on 0.0.0.0:80 which breaks other ingress controllers, such as maas-ingress.
All further ingress controllers can't start because they can't bind on 80 port.
Change-Id: Ie7e9563bf14fe347969bea0d3c900c8d87d06de0
Recently, the Minikube gate script was modified to support Ubuntu Bionic
[0]; however, the change made the script incompatible with Ubuntu Xenial
because libxtables12 is not available on Ubuntu Xenial. OpenStack-Helm
still supports Ubuntu Xenial, and this script should too.
This change modifies the gate script to install iptables instead of
libxtables12. The iptables package depends on libxtables11 on Ubuntu
Xenial and libxtables12 on Ubuntu Bionic, so this achieves the same
result.
[0] https://review.opendev.org/650523
Change-Id: I5afbcfeca6e7b30857a44aed35a360595eeb5037
Signed-off-by: Drew Walters <andrew.walters@att.com>
This updates the tenant ceph job to provision the cephfs storage
class by removing the override that prevents it. This is required
for the ceph namespace activation deployment for osh-infra to
successfully pass its helm tests
Change-Id: I3f801cb2a369f6a073105296d7cc4f98fddf6a68
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This mvoes the default timeout for the ceph provisioners helm test
pod to 600 seconds, as 120 seconds is fairly aggressive. This
also adds the required --timeout flag to the helm test command in
each job for the ceph provisioners, as well as adding the required
helm test configuration to the armada-lma manifest
Change-Id: I5a3b98de9132fe83cf09b1e5b3fcc513bd496650
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This addresses issues with the armada-lma manifest that arose
after the splitting of the fluentbit and fluentd charts. The top
level labels key was missing from the fluentbit chart and the
logging chart group still referenced a nonexistent fluent-logging
chart
Change-Id: I5244fc9d065806c376ca5d18b6ced9ed445057c9
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the task in the disable-local-nameserver role to
include disabling the systemd-resolved service, as this causes
the entries we update in /etc/resolv.conf to not be honored as
systemd-resolved will use a different set of files for configuring
the nameservers it uses.
See: https://www.freedesktop.org/software/systemd/man/systemd-resolved.service.html
Change-Id: I68a623b7bcb32037b9eeff2d76c7f2cb317cb7d8
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the minikube deployment script to patch the
calico-node daemonset to set the appropriate annotations and
environment variables required for felix to expose prometheus
metrics
Change-Id: Ic5dc2ecb298add12cd3b150cc4d26e7639c43488
Signed-off-by: Steve Wilkerson <sw5822@att.com>
Extending the Openvswitch chart with support for DPDK. In order to
enable DPDK support, set the dpdk:enabled option to true in value.yaml.
Prerequisites for successfully running OVS with DPDK: the host OS must
to have hugepages enabled.
Co-Authored-By: Rihab Banday <rihab.banday@ericsson.com>
Change-Id: I9649832511ba7c7ba7c391555d60171ef9264110
Added new HTTP Security header Content-Security-Policy:self to make
sure the browser does not allow any cross-site scripting attacks.
Added new HTTP Security header X-Permitted-Cross-Domain-Policies:none
To prevent web client to load data from the current domain.
Added new HTTP Security header X-XSS-Protection:1 mode=block to
sanitize the page, when a XSS attack is detected, the browser will
prevent rendering of the page.
Change-Id: Ief137738f4b793f49f3632e25339c6f49492fd80
Grafana helm test is failing with the below error
"NameError: name 'exception' is not defined"
This is because exception is defined in smaller case. changing
exception to Exception fixes this issue
Change-Id: I533ae822babb4f063242fee1cd42b5b821519b5f
Signed-off-by: Sreejith Punnapuzha <Sreejith.Punnapuzha@outlook.com>