This patch set adds logic to generate Kubernetes egress network policy
rules based on the dependencies specified in values.yaml. It also sets
up the necessary default network policy for the OSH gate.
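
As a rough illustration of the idea (not the chart's template code), the
sketch below turns a list of dependency names into NetworkPolicy egress
rules. The `egress_rules` helper and the use of an `application` pod label
as the selector are assumptions made for this example.

```python
def egress_rules(dependency_apps):
    """Build one NetworkPolicy egress rule per dependency.

    `dependency_apps` is a hypothetical list of `application` label values,
    such as might be derived from the dependencies in values.yaml.
    """
    rules = []
    for app in dependency_apps:
        rules.append(
            {"to": [{"podSelector": {"matchLabels": {"application": app}}}]}
        )
    return rules


# Example: a service that depends on a database and a message queue.
rules = egress_rules(["mariadb", "rabbitmq"])
```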
Change-Id: I1ac649cc9debb5d1f4ea0a32f506dcda4d8b8536
Signed-off-by: Tin Lam <tin@irrational.io>
This change replaces direct references to the exporter port
in values.yaml with calls to helm-toolkit lookup functions.
The referenced port number under the network key is removed,
as the helm-toolkit function will return the port number under
the endpoints key.
Change-Id: Ib6f533c49af5a88fca377920d28d5468d7387892
This updates the gather-prom-metrics role to include gathering
metrics from the active ceph-mgr endpoint
Change-Id: Icb5d27b6a070e9065f6276725bf06dec7d2cbc0d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the podManagementPolicy to 'Parallel' for Prometheus
and Alertmanager, as there's no need to handle deploying these
two services in a sequential manner
Change-Id: I2f33b9651bed20c4cb2e0c477ae2227cbf9310cf
Signed-off-by: Steve Wilkerson <sw5822@att.com>
When the ingress pod (in routed mode, using a managed vip) moves from
one host to another, it is sometimes observed that: 1. the vip interface
is not removed on the original host, and 2. in some network topologies,
the switch fabric is unable to find the new pod.
This change updates the ingress deployment as follows:
- Adds a 5s sleep before the shutdown of the ingress container to allow
  the preStop action of the ingress-vip container to run to completion.
- Updates the start action of the ingress-vip-init container to check
  whether the vip is part of an existing connected subnet and, if so,
  send a few gratuitous ARP messages so that the switch fabric can
  update its ARP cache.
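
The subnet check described above can be sketched as follows. This is only
an illustration using Python's standard ipaddress module; the function name
and the way the connected subnets are supplied are assumptions, and the
container's actual start action is not shown here.

```python
import ipaddress


def vip_in_connected_subnet(vip, connected_cidrs):
    """Return True if `vip` falls inside any directly connected subnet.

    `vip` may be given with or without a prefix length (for example
    "10.0.0.50" or "10.0.0.50/32"). `connected_cidrs` is a hypothetical
    list of interface addresses in CIDR form, such as "10.0.0.12/24".
    """
    addr = ipaddress.ip_interface(vip).ip
    return any(
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in connected_cidrs
    )


# Gratuitous ARP only makes sense when the vip is on a connected subnet.
should_send_garp = vip_in_connected_subnet("10.0.0.50/32", ["10.0.0.12/24"])
```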
Change-Id: I784906865358566f42157dc2133569e4cb270cfa
ceph-disk has been deprecated, and ceph-volume is available as of the
Luminous release. This uplifts the ceph-osd charts to use ceph-volume,
with support for all of the combinations below.
Filestore:
- ceph-disk to ceph-volume
- ceph-volume to ceph-volume
Bluestore (including db and wal combinations):
- ceph-disk to ceph-volume
- ceph-volume to ceph-volume
Different OSDs may run different stores, and upgrades with db and wal
combinations are supported.
Upgrading across store types is not supported.
Story: ceph-volume-support
Signed-off-by: Kranthi Guttikonda <kranthi.guttikonda@att.com>
Co-Authored-By: Chinasubbareddy Mallavarapu <cr3938@att.com>
Change-Id: Id8b2e1bda0d35fef2cffed6a5ca5876f3888a1c7
This removes the elasticsearch-ldap.sh script from the single node
osh-infra-logging job, as this step does not provide any real
value and is tightly coupled to the elasticsearch version used.
This sort of validation should be reserved for smoke tests in
future helm tests for charts
Change-Id: I7ca4805a8809568cb09c8bab6c239c008528fd6a
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This creates a separate folder for ceph-disk based deployments
so that they are easy to maintain when we introduce ceph-volume.
Separate folders for the two tools give us the flexibility to develop or
fix issues and commit code to the respective folder without breaking
deployments based on the other tool.
Change-Id: Ib0099d292a8692dc6676eb5ed624d5d1ef677cfe
This updates the Prometheus version deployed by default from
2.3.2 to 2.12.0
Change-Id: Ic10e02a6b136a7f65fb686f5ef1adf1bcf6a9a9d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This updates the Grafana version deployed by default from 5.0.0 to
6.2.0
Change-Id: I39b5405cc3f3fe7754ed6544a8388ff912a4ef58
Signed-off-by: Steve Wilkerson <sw5822@att.com>
This proposes adding a zookeeper chart to osh-infra that aligns
with the design patterns laid out by the other charts in osh-infra
and osh.
Change-Id: I25edc58fc951e7f81f7275ade6cf9c97e0afae02
Signed-off-by: Steve Wilkerson <sw5822@att.com>
Co-Authored-By: Steven Fitzpatrick <steven.fitzpatrick@att.com>
This updates the ceph health check command in Nagios to use the
updated plugin that determines the active ceph-mgr instance
endpoint to use before querying for ceph's health. This results in
more robust and reliable reporting of ceph's overall health
Depends-On: https://review.opendev.org/#/c/693900/
Change-Id: I5eeb076e5af3c820dbdcc3cc321cefcb5f85ef8d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
Trivial change. This patch set cleans up a python script.
- Move the comment to a helm-template comment so the python comments do
not get rendered by helm.
- Remove an unused python module.
Change-Id: Id287ddae8904d2cfa88725277bb97cf027a942c3
Signed-off-by: Tin Lam <tin@irrational.io>
This patch set implements the Grafana metrics changes required for the
Kubernetes upgrade to 1.16. The updates are mostly specific to cAdvisor
metric labels. They ensure that all existing metrics are still scraped
and available in Prometheus so that they can be consumed by Grafana and
Nagios.
Change-Id: I74369ac49dd3f7d9f3682dd5318a3818a4d3f178
AppArmor annotations require the container name to be applied properly.
Before this change, when overrides are not used, the container name is
ceph-osd-default. When overrides are used, the container name is of the
form ceph-osd-HOSTNAME-SHA, but with an identical HOSTNAME and SHA for
all the daemonsets. However, it is not possible to predict this value,
and as a result, the AppArmor profiles are not applied.
This change removes the customization of the container name, and sets
it to ceph-osd-default, allowing AppArmor annotations to be consistently
applied using:
  pod:
    mandatory_access_control:
      type: apparmor
      ceph-osd-default:
        ceph-osd-default: localhost/profilename
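
To show why a predictable container name matters, the sketch below builds
the per-container AppArmor annotation keys from a mapping of container name
to profile. The annotation prefix is the Kubernetes beta AppArmor
annotation; the `apparmor_annotations` helper is hypothetical and is not
the chart's template code.

```python
# Kubernetes AppArmor annotations are keyed by container name.
APPARMOR_PREFIX = "container.apparmor.security.beta.kubernetes.io/"


def apparmor_annotations(profiles):
    """Map {container_name: profile} to pod annotations (illustrative)."""
    return {APPARMOR_PREFIX + name: profile for name, profile in profiles.items()}


# With a fixed container name, the annotation key is known ahead of time.
annotations = apparmor_annotations({"ceph-osd-default": "localhost/profilename"})
```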
Change-Id: I8b6eda00f77ec7393a4311309f3ff76908d06ae6
The patch adds Network Policy ingress rules for RabbitMQ
and Prometheus RabbitMQ exporter.
It also fixes name generation for network policies,
to make sure they do not contain a prohibited '_' symbol,
which may appear in some label names.
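
A minimal sketch of the naming fix, assuming the name is derived from a
label value: Kubernetes object names may only contain lowercase
alphanumerics, '-' and '.', so any '_' is replaced with '-'. The
`policy_name` helper is hypothetical and only illustrates the substitution.

```python
def policy_name(label):
    """Return a name safe for a NetworkPolicy by replacing '_' with '-'."""
    return label.lower().replace("_", "-")


name = policy_name("prometheus_rabbitmq_exporter")
```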
Change-Id: I9821983b61d90e73e62c5ac669eefeb4ba9999d2
It was observed in some charts' values.yaml that the values defining
lifecycle upgrade parameters were incorrectly placed.
This change aims to correct these instances by adding a deployment-
type subkey corresponding with the deployment types identified in
the chart's templates dir, and indenting the values appropriately.
Change-Id: Id5437b1eeaf6e71472520f1fee91028c9b6bfdd3
This updates the ingress objects to move them back to the
extensions API. While 1.16 moves them under the networking
api, they're still rendered and deployed as extensions/ objects.
This move prevents issues from arising where older versions of
kubernetes might still be deployed during an upgrade, as the
move to the networking API is nonfunctional at this time
Change-Id: I814bbc833b5b9f79f34aefc60b9c1f9890bca826
Signed-off-by: Steve Wilkerson <sw5822@att.com>
Pods for some of the CronJobs do not have the correct
application and component labels applied, so they are
unable to start if Network Policies are enabled.
Related-Change: Ie4eed0e9829419b4b2e40e9b712b73a86d6fc3d2
Change-Id: Ieee874bf837c7947e3681e0447d150174c99d880
The openvswitch-vswitchd pod should not start until there is a Ready
openvswitch-vswitchd-db pod on the same node. This change adds the
appropriate dependency to cause it to wait.
Change-Id: I5c827971c99639d2f1c3a24a1761524b3a165421
It was observed that sometimes during a
galera cluster restart the node with the highest
seqno is determined incorrectly. After investigation
it was found that the max function is invoked on a
list of string values, which can lead to incorrect results.
This patch casts each value to an integer before building
the list of seqnos, so the max function returns the correct result.
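
The failure mode can be reproduced with a small example. The seqno values
below are made up for illustration and do not come from a real cluster.

```python
# Hypothetical seqno values as read from each node, still as text.
seqnos_as_text = ["9", "10", "-1"]

# Strings compare character by character, so "9" ranks above "10".
wrong = max(seqnos_as_text)

# Casting to int first gives a numeric comparison.
right = max(int(s) for s in seqnos_as_text)
```

Here `wrong` is the string "9" while `right` is the integer 10, so only the
second form selects the node with the highest seqno.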
Change-Id: I604ec837f3f2d157c829ab43a44e561879775c77