2142 Commits

Author SHA1 Message Date
Zuul
2a33842a9f Merge "Move ingress config to separate configmap" 2019-11-25 15:28:21 +00:00
Zuul
05c3ec119b Merge "Grafana: Support multiple datasources" 2019-11-22 16:43:18 +00:00
Steve Wilkerson
97e029e606 Grafana: Support multiple datasources
This updates the Grafana chart to support the definition of
multiple datasources. This moves to defining a template in the
chart's values.yaml file that allows for inline gotpl for
defining an arbitrary number of datasources. This also updates the
grafana dashboards to include a selector for the Prometheus
datasource to use via a drop down selector. This is vetted out in
the federated monitoring job

Change-Id: I55171fed5c2b343130d135d0b42bc96ff11c4712
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-22 14:45:04 +00:00
Steve Wilkerson
0b86616c6f Make keystone-auth job nonvoting
This makes the keystone-auth job nonvoting, until adequate work
can be done to help make the job more reliable. At the moment,
this job seems to be responsible for the majority of the gate job
failures due to what seems to be limitations with the single node
nodesets available

Change-Id: I08f1f10b79e9a5fd82ef7c6d887a03ccb55cceed
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-22 14:44:52 +00:00
Zuul
4a4676b3f7 Merge "Add rally environment cleanup" 2019-11-22 06:26:31 +00:00
Zuul
873838f11f Merge "Prometheus: Update chart to support federation" 2019-11-22 05:27:18 +00:00
Tin Lam
0dd938d1be Add rally environment cleanup
This patch set adds a command to clean up the rally environment after a
helm test's execution completes.

Change-Id: I652ee4930e7afb8b278250a0432086a2963a528c
Signed-off-by: Tin Lam <tin@irrational.io>
2019-11-22 04:32:48 +00:00
Zuul
14e712c9df Merge "Re-enable experimental jobs in osh-infra" 2019-11-22 00:08:51 +00:00
Zuul
108f89b208 Merge "Update egress HTK method" 2019-11-22 00:08:50 +00:00
Zuul
2b3d3ef131 Merge "Move charts off using the :latest built tags" 2019-11-21 21:34:04 +00:00
Zuul
7a8fbb17d1 Merge "Grafana: Add support for arbitrary environment variables" 2019-11-21 20:47:47 +00:00
Tin Lam
3121fc24c5 Update egress HTK method
This patch set adds logic to generate Kubernetes egress network policy
rules based on the dependencies specified in values.yaml. This also sets
up the necessary default network policy for the OSH gate.

Change-Id: I1ac649cc9debb5d1f4ea0a32f506dcda4d8b8536
Signed-off-by: Tin Lam <tin@irrational.io>
2019-11-21 20:05:34 +00:00
Steve Wilkerson
cbeb7f149b Move charts off using the :latest built tags
This updates charts that consume images built from osh-images to
use tags other than the :latest tags. This will be followed up
with the definition of jobs to allow for vetting out of updated
images, as reliance on :latest tags assumes any change merged into
osh-images will result in functionally correct behavior (which has
historically been shown not to be the case)

Change-Id: I181aa56ed187604dc7583d8081e53cc69eb27310
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-21 19:57:07 +00:00
Steve Wilkerson
eabc9fad64 Re-enable experimental jobs in osh-infra
This adds the experimental jobs back to osh-infra, as they were
erroneously disabled via comments in a previously merged change

Change-Id: Id92c24223f8c22f1a0ff82b62c222b2920ecd929
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-21 13:53:00 -06:00
Zuul
092709d875 Merge "RabbitMQ Exporter: Replace Direct Values w/ HTK" 2019-11-21 18:22:55 +00:00
Zuul
b6828f9e6a Merge "Add ceph metrics to postrun metrics gathering role" 2019-11-21 17:28:43 +00:00
Steven Fitzpatrick
ca6ad711a4 RabbitMQ Exporter: Replace Direct Values w/ HTK
This change replaces direct references to the exporter port
in values.yaml with calls to helm-toolkit lookup functions.

The referenced port number under the network key is removed,
as the helm-toolkit function will return the port number under
the endpoints key.

Change-Id: Ib6f533c49af5a88fca377920d28d5468d7387892
2019-11-21 12:52:55 +00:00
Steve Wilkerson
a4816feda2 Grafana: Add support for arbitrary environment variables
This updates the Grafana chart to support the definition of
arbitrary environment variables to support scenarios where
additional information may be required at runtime for things like
datasource and dashboard provisioning

Change-Id: I95e4abe9030116a440c6d78a1d14dbcaaf743b40
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-21 12:40:04 +00:00
Steve Wilkerson
fbd34421f2 Prometheus: Update chart to support federation
This updates the Prometheus chart to support federation. This
moves to defining the Prometheus configuration file via a template
in the values.yaml file instead of through raw yaml. This allows
for overriding the chart's default configuration wholesale, as
this would be required for a hierarchical federated setup. This
also strips out all of the default rules defined in the chart for
the same reason. There are example rules defined for the various
aspects of OSH's infrastructure in the prometheus/values_overrides
directory that are executed as part of the normal CI jobs. This
also adds a nonvoting federated-monitoring job that vets out the
ability to federate prometheus in a hierarchical fashion with
extremely basic overrides

Change-Id: I0f121ad5e4f80be4c790dc869955c6b299ca9f26
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-21 12:39:56 +00:00
Zuul
0edd3e18de Merge "Update podManagementPolicy for Prometheus and Alertmanager" 2019-11-21 00:41:03 +00:00
Zuul
d85a41c3c1 Merge "ceph-volume integration to ceph-osd charts" 2019-11-20 22:03:45 +00:00
Steve Wilkerson
ef4cbb3b08 Add ceph metrics to postrun metrics gathering role
This updates the gather-prom-metrics role to include gathering
metrics from the active ceph-mgr endpoint

Change-Id: Icb5d27b6a070e9065f6276725bf06dec7d2cbc0d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-20 21:42:49 +00:00
Steve Wilkerson
c1555920e5 Update podManagementPolicy for Prometheus and Alertmanager
This updates the podManagementPolicy to 'Parallel' for Prometheus
and Alertmanager, as there's no need to handle deploying these
two services in a sequential manner

Change-Id: I2f33b9651bed20c4cb2e0c477ae2227cbf9310cf
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-20 21:37:55 +00:00
Phil Sphicas
6ca136bae4 Ingress chart managed VIP fixes cleanup/startup
When the ingress pod (in routed mode, using a managed vip) moves from
one host to another, it is sometimes observed that: 1. the vip interface
is not removed on the original host, and 2. in some network topologies,
the switch fabric is unable to find the new pod.

This change updates the ingress deployment as follows:

Adds a 5s sleep before the shutdown of the ingress container in order to
allow the preStop action of the ingress-vip container to run completely.

Updates the start action of the ingress-vip-init container to check if
the vip is part of an existing connected subnet, and if so, sends a few
gratuitous ARP messages to let the switch fabric build its ARP cache.

Change-Id: I784906865358566f42157dc2133569e4cb270cfa
2019-11-20 07:25:50 -08:00
kranthikirang
41684a3c29 ceph-volume integration to ceph-osd charts
ceph-disk has been deprecated, and ceph-volume has been
available since the Luminous release. This uplifts the
ceph-osd charts to use ceph-volume, with support for
all of the combinations below:

Filestore:
ceph-disk to ceph-volume
ceph-volume to ceph-volume

Bluestore: (including db, wal combinations)
ceph-disk to ceph-volume
ceph-volume to ceph-volume

Different OSDs can run different stores, and upgrades
with db and wal combinations are supported.

Upgrading across store types is not supported.

Story: ceph-volume-support
Signed-off-by: Kranthi Guttikonda <kranthi.guttikonda@att.com>
Co-Authored-By: Chinasubbareddy Mallavarapu <cr3938@att.com>
Change-Id: Id8b2e1bda0d35fef2cffed6a5ca5876f3888a1c7
2019-11-20 10:02:08 -05:00
Mykyta Karpin
2cffc4e3ae Move ingress config to separate configmap
Currently, updating the mariadb configuration also restarts the ingress
pods, although there is no reason for this.

Change-Id: I398e20541a0e2337e9a5d100f3ef6ce4ad7d0284
2019-11-20 14:14:09 +00:00
Steve Wilkerson
4e7b8a183e Remove elasticsearch ldap test from osh-infra-logging
This removes the elasticsearch-ldap.sh script from the single node
osh-infra-logging job, as this step does not provide any real
value and is tightly coupled to the elasticsearch version used.
This sort of validation should be reserved for smoke tests in
future helm tests for charts

Change-Id: I7ca4805a8809568cb09c8bab6c239c008528fd6a
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-20 12:52:51 +00:00
Chinasubbareddy Mallavarapu
2b42632c9b [ceph-osd] Separate ceph-disk based deployment scripts
This creates a separate folder for ceph-disk based deployments
so that they are easy to maintain when we introduce ceph-volume.

A separate folder for each tool gives us the flexibility to develop or
fix issues and commit code to the respective folder without breaking
deployments based on the other tool.

Change-Id: Ib0099d292a8692dc6676eb5ed624d5d1ef677cfe
2019-11-19 22:05:36 +00:00
Zuul
ac6fa2977c Merge "Prometheus: Update version" 2019-11-19 21:57:03 +00:00
Zuul
3e18a436d2 Merge "Grafana: Update version" 2019-11-19 20:50:37 +00:00
Steve Wilkerson
0c51a9cab8 Prometheus: Update version
This updates the Prometheus version deployed by default from
2.3.2 to 2.12.0

Change-Id: Ic10e02a6b136a7f65fb686f5ef1adf1bcf6a9a9d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-19 12:03:43 -06:00
Zuul
029f94a776 Merge "Grafana - Update cadvisor labels for k8s 1.16" 2019-11-18 20:55:10 +00:00
Steve Wilkerson
1bfa091203 Grafana: Update version
This updates the Grafana version deployed by default from 5.0.0 to
6.2.0

Change-Id: I39b5405cc3f3fe7754ed6544a8388ff912a4ef58
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-18 08:49:38 -06:00
Zuul
d0b4803b3c Merge "Add zookeeper chart to osh-infra" 2019-11-17 08:45:01 +00:00
Zuul
fb50a4dbee Merge "Fix K8s version" 2019-11-17 08:34:06 +00:00
Zuul
84596d5eba Merge "Add RabbitMQ ingress Network Policy rules" 2019-11-17 07:18:12 +00:00
Tin Lam
7b332076d7 Fix K8s version
Trivial fix to make all Kubernetes versions consistently 1.16.2.

Change-Id: I51d567c57604150cba2274c153817b4401a8e707
2019-11-17 06:20:33 +00:00
Zuul
7fca2677da Merge "Add default Network Policies for Mariadb Prometheus Exporter" 2019-11-15 16:41:22 +00:00
Steve Wilkerson
608d75ec8d Add zookeeper chart to osh-infra
This proposes adding a zookeeper chart to osh-infra that aligns
with the design patterns laid out by the other charts in osh-infra
and osh.

Change-Id: I25edc58fc951e7f81f7275ade6cf9c97e0afae02
Signed-off-by: Steve Wilkerson <sw5822@att.com>
Co-Authored-By: Steven Fitzpatrick <steven.fitzpatrick@att.com>
2019-11-14 19:51:20 +00:00
Zuul
be29dd6fb6 Merge "Fxing lint errors for Helm 2.16" 2019-11-14 17:33:35 +00:00
Steve Wilkerson
59dac085ce Nagios: Update ceph health check command
This updates the ceph health check command in Nagios to use the
updated plugin that determines the active ceph-mgr instance
endpoint to use before querying for ceph's health. This results in
more robust and reliable reporting of ceph's overall health

Depends-On: https://review.opendev.org/#/c/693900/

Change-Id: I5eeb076e5af3c820dbdcc3cc321cefcb5f85ef8d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-13 08:51:26 -06:00
Bjoern Teipel
b500d69591 Fxing lint errors for Helm 2.16
This commit fixes helm lint errors when linting against
the recent helm version.

Change-Id: I2a940ad1cea406ba923519cd5be188ee1bc409aa
2019-11-12 11:28:22 -06:00
Tin Lam
b4a422a798 Clean up python script
Trivial change. This patch set cleans up a python script.

- Move the comment to a helm-template comment so the python comments do
not get rendered by helm.
- Remove an unused python module.

Change-Id: Id287ddae8904d2cfa88725277bb97cf027a942c3
Signed-off-by: Tin Lam <tin@irrational.io>
2019-11-11 22:45:38 +00:00
Bharat Khare
ab95e311a3 Grafana - Update cadvisor labels for k8s 1.16
This patch set implements the Grafana metrics changes required
for the Kubernetes upgrade to 1.16. The updates are mostly
specific to cadvisor metric labels, and ensure that all
existing metrics are scraped and available in Prometheus so that
they can be consumed by Grafana & Nagios.

Change-Id: I74369ac49dd3f7d9f3682dd5318a3818a4d3f178
2019-11-11 17:57:09 +00:00
Zuul
7bcb16379e Merge "Update grafana link" 2019-11-11 09:42:14 +00:00
Zuul
c90ffb11f9 Merge "Grafana gridPos y key resolves true in chart json" 2019-11-11 09:42:13 +00:00
Zuul
cd860e9017 Merge "Add missing pod labels for CronJobs" 2019-11-11 09:38:35 +00:00
Zuul
f504a1709d Merge "Update the constraints url" 2019-11-11 09:38:34 +00:00
Evgeny L
f173d6103f Add default Network Policies for Mariadb Prometheus Exporter
The Pod fails to start because default network policies for the
MySQL Prometheus Exporter are missing.

Change-Id: Ib9f013f97a83da0c2e36f2d38e54ae0a906700e5
2019-11-11 07:46:26 +00:00
Zuul
02af18d5dc Merge "Fix search of max sequence number" 2019-11-11 01:08:01 +00:00