2058 Commits

Author SHA1 Message Date
Tin Lam
ac18e6acf9 Fix feature gate envvar overriding
Currently using envsubst to perform substitution of value overrides in
the feature gate caused conflicts as gotpl gets templated into those
overrides. This adds in '%%%REPLACE_${var}%%%' and uses sed to perform
the substitution instead to address the issue.

Change-Id: I9d3d630b53a2f3d828866229a5072bb04440ae15
Signed-off-by: Tin Lam <tin@irrational.io>
2019-12-07 12:22:16 -06:00
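
The placeholder approach described in ac18e6acf9 can be sketched as follows. The commit itself uses sed; this Python version is only an illustration. The `%%%REPLACE_${var}%%%` marker format comes from the commit message, while the function name, the regex, and the sample override text are assumptions. The likely source of the original conflict is that envsubst rewrites every `$name` reference, including gotpl variables, whereas a dedicated marker leaves them alone.

```python
import re

# Marker format taken from the commit message; the regex itself is an assumption.
PLACEHOLDER = re.compile(r"%%%REPLACE_([A-Za-z_][A-Za-z0-9_]*)%%%")


def substitute_placeholders(text, values):
    """Replace %%%REPLACE_NAME%%% markers and leave all other text untouched."""
    def lookup(match):
        name = match.group(1)
        # Unknown names are kept as-is so a missing value stays visible.
        return values.get(name, match.group(0))
    return PLACEHOLDER.sub(lookup, text)


# Hypothetical override snippet that mixes a marker with gotpl variables.
override = (
    "tag: %%%REPLACE_IMAGE_TAG%%%\n"
    "labels: {{ range $k, $v := .Values.labels }}{{ $k }}{{ end }}\n"
)
result = substitute_placeholders(override, {"IMAGE_TAG": "v1.2.3"})
print(result)
```

Only the marker is replaced; the gotpl `$k` and `$v` variables pass through unchanged.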
Zuul
d216fbf731 Merge "Elasticsearch: Remove unnecessary rbac definitions" 2019-12-06 18:16:06 +00:00
Zuul
bb7c2787c3 Merge "Elasticsearch/Kibana: Update version to 7.1.0" 2019-12-06 18:16:05 +00:00
Tin Lam
daefed7218 Add feature gate capability to OSH-Infra
This patch set adds the feature gate capability to OpenStack-Helm-Infra
repository without depending on the main OpenStack-Helm repository.

Change-Id: I70b8fac4fd2365f8eedcf50519f125eb34534f2f
Signed-off-by: Tin Lam <tlam@omegaprime.dev>
Signed-off-by: Tin Lam <tin@irrational.io>
2019-12-03 16:55:00 -06:00
Zuul
8bd11d1ad2 Merge "[ceph-client] Validate failure domain support for replica count per pool" 2019-12-03 22:23:08 +00:00
Zuul
f9479c31c9 Merge "Create Chart to Deploy Apache Kafka" 2019-12-03 22:02:39 +00:00
Zuul
9632d8719f Merge "Nagios: Add support for arbitrary object definitions via overrides" 2019-12-03 21:09:55 +00:00
Steven Fitzpatrick
e8f3d84ccc Create Chart to Deploy Apache Kafka
This proposes adding a kafka chart to osh-infra that aligns
with the design patterns laid out by the other charts in osh-infra
and osh.

danielqsj's kafka-exporter image is leveraged to deploy a prometheus
exporter for kafka alongside the main application if enabled in
values.yaml

Change-Id: I5997b0994fc3aef9bd1b222c373cc3a013112566
Co-Authored-By: Meghan Heisler <mh783g@att.com>
2019-12-03 11:37:54 -06:00
Steve Wilkerson
fd7067649a Elasticsearch: Remove unnecessary rbac definitions
This removes the cluster role definition from the Elasticsearch
component templates, as these are not needed for the service to
function correctly.

Change-Id: I671272affbed8984a47121187024e4b831937123
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-12-03 09:06:13 -06:00
Steve Wilkerson
6c4404ee4d Nagios: Disable Nagios page tours by default
This disables the Nagios page tours option. This option is enabled
by default, which results in a youtube video being overlaid on
each Nagios page.

Change-Id: Ifd80a8d122dcbe145315b37753a72e1309e1d210
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-12-03 14:48:41 +00:00
Steve Wilkerson
2d3c9575ff Elasticsearch/Kibana: Update version to 7.1.0
This updates the Elasticsearch and Kibana charts to deploy
version 7.1.0. This move required significant changes to both
charts, including: changing elasticsearch masters to a statefulset
to utilize reliable dns names for the discovery process, config
updates to reflect deprecated/updated/removed values, use the
kibana saved objects api for managing index patterns and setting
the default index, and updating the elasticsearch entrypoint
scripts to reflect the use of elastic-keystore for storing s3
credentials instead of defining them in the configuration file

Change-Id: I270d905f266fc15492e47d8376714ba80603e66d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-12-03 07:43:29 -06:00
Dustin Specker
ae8a6c5d50 refactor(deploy-k8s): remove explicit wait on etcd pod
Using `--network-plugin=cni` for `minikube start` will have minikube
wait for Kubernetes components to spin up and not require the Node to be
in ready status.

Change-Id: I08bf40ac4790955c107e8fee6a004b930c333d16
2019-12-02 19:21:19 +00:00
bw6938
699ea1acba [ceph-client] Validate failure domain support for replica count per pool
Ensure each pool is configured with enough failure domains to
satisfy the pool's replica size requirements. If any pool does
not have enough failure domains to satisfy the pool's replica size,
then fail the ceph deployment.

Change-Id: I9dd1cafd05e81f145d1eb8c916591203946bc8f1
2019-12-02 15:22:54 +00:00
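
The check described in 699ea1acba amounts to comparing each pool's replica size with the number of available failure domains. A minimal sketch, assuming pools are given as dictionaries with hypothetical `name` and `replication` keys and that the failure domain count is already known (the chart itself implements this in its own deployment scripts, which are not shown here):

```python
def pools_lacking_failure_domains(pools, failure_domain_count):
    """Return the names of pools whose replica size exceeds the failure domains."""
    return [
        pool["name"]
        for pool in pools
        if pool["replication"] > failure_domain_count
    ]


# Hypothetical pool definitions; key names are assumptions for illustration.
pools = [
    {"name": "rbd", "replication": 3},
    {"name": "cephfs_data", "replication": 2},
]
short = pools_lacking_failure_domains(pools, failure_domain_count=2)
if short:
    print("deployment should fail, not enough failure domains for:", short)
```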
Steve Wilkerson
6f7790e451 Nagios: Add support for arbitrary object definitions via overrides
This adds support for arbitrary object definitions via the conf
key in the Nagios chart. This allows for customizing the
definitions required by different deployment targets instead of
assuming all nagios deployments are monitoring and targeting the
same hosts and executing the same service checks and commands.

This also adds reference overrides to the chart for elasticsearch,
postgresql, and openstack nagios objects that are deployed in the
single and multinode jobs here

Change-Id: I6475ca980447591b5b691220eb841a2ab958e854
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-12-02 13:46:20 +00:00
Oleh Hryhorov
9492a8cde0 Fixing typo in exporter-deployment.yaml PUBLISH_PORT
This patch fixes a typo in PUBLISH_PORT and quotes its value, because
environment variable values must be strings. Without the quotes, the
deployment fails with the error below:

error updating the release: rpc error: code = Unknown desc = release
rabbitmq failed: Deployment in version "v1" cannot be handled as
a Deployment: v1.Deployment.Spec: v1.DeploymentSpec.Template: v1.PodTemplateSpec.Spec:
v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.Value:
ReadString: expects " or n, but found 9, error found in #10 byte of ...|,"value":9095},{"nam|...,
bigger context ...|value":"no_sort"},{"name":"PUBLISH_PORT","value":9095},{"name":"LOG_LEVEL","value":"info"},{"name":"|...

Change-Id: I027c91ee48df8eb5b4b2bf3fd28036b8eca47238
2019-11-28 17:26:27 +02:00
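
The error in 9492a8cde0 arises because an environment variable's `value` was rendered as a number rather than a string. A small sketch of a check for that condition on a rendered container `env` list; the function name is hypothetical and the sample data mirrors the fragment quoted in the error message:

```python
import json


def non_string_env_values(env):
    """Return names of env entries whose literal value is not a string."""
    return [
        entry["name"]
        for entry in env
        # Entries without a literal "value" key are ignored by this check.
        if "value" in entry and not isinstance(entry["value"], str)
    ]


broken = json.loads(
    '[{"name":"PUBLISH_PORT","value":9095},{"name":"LOG_LEVEL","value":"info"}]'
)
fixed = json.loads(
    '[{"name":"PUBLISH_PORT","value":"9095"},{"name":"LOG_LEVEL","value":"info"}]'
)
print(non_string_env_values(broken))
print(non_string_env_values(fixed))
```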
Drew Walters
992e82fc1d tools: Sort resolv.conf minikube K8s script
The way that the minikube K8s script orders a host's resolv.conf file
leaves service endpoints inaccessible from the host itself even though
they are accessible within the cluster, leaving the OpenStack client
unusable from the minikube node. This change resolves the service access
issues by reordering the DNS entries in the host's resolv.conf file.

Change-Id: I58bf6d541e59f3049a0e350291e07241f6a6b544
Signed-off-by: Drew Walters <andrew.walters@att.com>
2019-11-25 21:13:29 +00:00
Zuul
2a33842a9f Merge "Move ingress config to separate configmap" 2019-11-25 15:28:21 +00:00
Zuul
05c3ec119b Merge "Grafana: Support multiple datasources" 2019-11-22 16:43:18 +00:00
Steve Wilkerson
97e029e606 Grafana: Support multiple datasources
This updates the Grafana chart to support the definition of
multiple datasources. This moves to defining a template in the
chart's values.yaml file that allows for inline gotpl for
defining an arbitrary number of datasources. This also updates the
grafana dashboards to include a selector for the Prometheus
datasource to use via a drop down selector. This is vetted out in
the federated monitoring job

Change-Id: I55171fed5c2b343130d135d0b42bc96ff11c4712
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-22 14:45:04 +00:00
Steve Wilkerson
0b86616c6f Make keystone-auth job nonvoting
This makes the keystone-auth job nonvoting, until adequate work
can be done to help make the job more reliable. At the moment,
this job seems to be responsible for the majority of the gate job
failures due to what seems to be limitations with the single node
nodesets available

Change-Id: I08f1f10b79e9a5fd82ef7c6d887a03ccb55cceed
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-22 14:44:52 +00:00
Zuul
4a4676b3f7 Merge "Add rally environment cleanup" 2019-11-22 06:26:31 +00:00
Zuul
873838f11f Merge "Prometheus: Update chart to support federation" 2019-11-22 05:27:18 +00:00
Tin Lam
0dd938d1be Add rally environment cleanup
This patch set adds a command to clean up a rally environment after a
helm test's execution is completed.

Change-Id: I652ee4930e7afb8b278250a0432086a2963a528c
Signed-off-by: Tin Lam <tin@irrational.io>
2019-11-22 04:32:48 +00:00
Zuul
14e712c9df Merge "Re-enable experimental jobs in osh-infra" 2019-11-22 00:08:51 +00:00
Zuul
108f89b208 Merge "Update egress HTK method" 2019-11-22 00:08:50 +00:00
Zuul
2b3d3ef131 Merge "Move charts off using the :latest built tags" 2019-11-21 21:34:04 +00:00
Zuul
7a8fbb17d1 Merge "Grafana: Add support for arbitrary environment variables" 2019-11-21 20:47:47 +00:00
Tin Lam
3121fc24c5 Update egress HTK method
This patch set places logic to generate kubernetes egress network policy
rule based on the dependencies specified in values.yaml. This also sets
up the necessary default network policy for the OSH gate.

Change-Id: I1ac649cc9debb5d1f4ea0a32f506dcda4d8b8536
Signed-off-by: Tin Lam <tin@irrational.io>
2019-11-21 20:05:34 +00:00
Steve Wilkerson
cbeb7f149b Move charts off using the :latest built tags
This updates charts that consume images built from osh-images to
use tags other than the :latest tags. This will be followed up
with the definition of jobs to allow for vetting out of updated
images, as reliance on :latest tags assumes any change merged into
osh-images will result in functionally correct behavior (which has
historically not been the case)

Change-Id: I181aa56ed187604dc7583d8081e53cc69eb27310
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-21 19:57:07 +00:00
Steve Wilkerson
eabc9fad64 Re-enable experimental jobs in osh-infra
This adds the experimental jobs back to osh-infra, as they were
erroneously disabled via comments in a previously merged change

Change-Id: Id92c24223f8c22f1a0ff82b62c222b2920ecd929
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-21 13:53:00 -06:00
Zuul
092709d875 Merge "RabbitMQ Exporter: Replace Direct Values w/ HTK" 2019-11-21 18:22:55 +00:00
Zuul
b6828f9e6a Merge "Add ceph metrics to postrun metrics gathering role" 2019-11-21 17:28:43 +00:00
Steven Fitzpatrick
ca6ad711a4 RabbitMQ Exporter: Replace Direct Values w/ HTK
This change replaces direct references to the exporter port
in values.yaml with calls to helm-toolkit lookup functions.

The referenced port number under the network key is removed,
as the helm-toolkit function will return the port number under
the endpoints key.

Change-Id: Ib6f533c49af5a88fca377920d28d5468d7387892
2019-11-21 12:52:55 +00:00
Steve Wilkerson
a4816feda2 Grafana: Add support for arbitrary environment variables
This updates the Grafana chart to support the definition of
arbitrary environment variables to support scenarios where
additional information may be required at runtime for things like
datasource and dashboard provisioning

Change-Id: I95e4abe9030116a440c6d78a1d14dbcaaf743b40
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-21 12:40:04 +00:00
Steve Wilkerson
fbd34421f2 Prometheus: Update chart to support federation
This updates the Prometheus chart to support federation. This
moves to defining the Prometheus configuration file via a template
in the values.yaml file instead of through raw yaml. This allows
for overriding the chart's default configuration wholesale, as
this would be required for a hierarchical federated setup. This
also strips out all of the default rules defined in the chart for
the same reason. There are example rules defined for the various
aspects of OSH's infrastructure in the prometheus/values_overrides
directory that are executed as part of the normal CI jobs. This
also adds a nonvoting federated-monitoring job that vets out the
ability to federate prometheus in a hierarchical fashion with
extremely basic overrides

Change-Id: I0f121ad5e4f80be4c790dc869955c6b299ca9f26
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-21 12:39:56 +00:00
Zuul
0edd3e18de Merge "Update podManagementPolicy for Prometheus and Alertmanager" 2019-11-21 00:41:03 +00:00
Zuul
d85a41c3c1 Merge "ceph-volume integration to ceph-osd charts" 2019-11-20 22:03:45 +00:00
Steve Wilkerson
ef4cbb3b08 Add ceph metrics to postrun metrics gathering role
This updates the gather-prom-metrics role to include gathering
metrics from the active ceph-mgr endpoint

Change-Id: Icb5d27b6a070e9065f6276725bf06dec7d2cbc0d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-20 21:42:49 +00:00
Steve Wilkerson
c1555920e5 Update podManagementPolicy for Prometheus and Alertmanager
This updates the podManagementPolicy to 'Parallel' for Prometheus
and Alertmanager, as there's no need to handle deploying these
two services in a sequential manner

Change-Id: I2f33b9651bed20c4cb2e0c477ae2227cbf9310cf
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-20 21:37:55 +00:00
Phil Sphicas
6ca136bae4 Ingress chart managed VIP fixes cleanup/startup
When the ingress pod (in routed mode, using a managed vip) moves from
one host to another, it is sometimes observed that: 1. the vip interface
is not removed on the original host, and 2. in some network topologies,
the switch fabric is unable to find the new pod.

This change updates the ingress deployment as follows:

Adds a 5s sleep before the shutdown of the ingress container in order to
allow the preStop action of the ingress-vip container to run completely.

Updates the start action of the ingress-vip-init container to check if
the vip is part of an existing connected subnet, and if so, sends a few
gratuitous ARP messages to let the switch fabric build its ARP cache.

Change-Id: I784906865358566f42157dc2133569e4cb270cfa
2019-11-20 07:25:50 -08:00
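
The subnet check mentioned in 6ca136bae4 can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the membership test; the function name and the sample addresses are assumptions, and the chart's init container performs its own check and the ARP announcement, neither of which is reproduced here.

```python
import ipaddress


def vip_in_connected_subnet(vip, connected_subnets):
    """Return True if the VIP address falls inside any connected subnet."""
    address = ipaddress.ip_interface(vip).ip
    return any(
        # strict=False accepts interface-style input such as 172.18.0.1/24.
        address in ipaddress.ip_network(subnet, strict=False)
        for subnet in connected_subnets
    )


# Hypothetical addresses for illustration only.
subnets = ["172.18.0.1/24", "10.0.0.1/8"]
print(vip_in_connected_subnet("172.18.0.5/32", subnets))
print(vip_in_connected_subnet("192.168.1.1/32", subnets))
```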
kranthikirang
41684a3c29 ceph-volume integration to ceph-osd charts
ceph-disk has been deprecated, and ceph-volume has been
available since the luminous release. This uplifts the
ceph-osd charts to use ceph-volume, with support for
all of the combinations below

Filestore:
ceph-disk to ceph-volume
ceph-volume to ceph-volume

Bluestore: (including db, wal combinations)
ceph-disk to ceph-volume
ceph-volume to ceph-volume

Different osds can run different stores, and upgrades
with db, wal combinations are supported

Cross upgrades from one store to another aren't supported

Story: ceph-volume-support
Signed-off-by: Kranthi Guttikonda <kranthi.guttikonda@att.com>
Co-Authored-By: Chinasubbareddy Mallavarapu <cr3938@att.com>
Change-Id: Id8b2e1bda0d35fef2cffed6a5ca5876f3888a1c7
2019-11-20 10:02:08 -05:00
Mykyta Karpin
2cffc4e3ae Move ingress config to separate configmap
Currently, updating the configuration for mariadb also restarts the
ingress pods, although there is no reason for this.

Change-Id: I398e20541a0e2337e9a5d100f3ef6ce4ad7d0284
2019-11-20 14:14:09 +00:00
Steve Wilkerson
4e7b8a183e Remove elasticsearch ldap test from osh-infra-logging
This removes the elasticsearch-ldap.sh script from the single node
osh-infra-logging job, as this step does not provide any real
value and is tightly coupled to the elasticsearch version used.
This sort of validation should be reserved for smoke tests in
future helm tests for charts

Change-Id: I7ca4805a8809568cb09c8bab6c239c008528fd6a
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-20 12:52:51 +00:00
Chinasubbareddy Mallavarapu
2b42632c9b [ceph-osd] Separate ceph-disk based deployment scripts
This creates a separate folder for ceph-disk based deployments
so that they will be easy to maintain when we introduce ceph-volume.

A separate folder for each tool gives us the flexibility to develop or
fix issues and commit the code to the respective folder without breaking
deployments based on the other tool.

Change-Id: Ib0099d292a8692dc6676eb5ed624d5d1ef677cfe
2019-11-19 22:05:36 +00:00
Zuul
ac6fa2977c Merge "Prometheus: Update version" 2019-11-19 21:57:03 +00:00
Zuul
3e18a436d2 Merge "Grafana: Update version" 2019-11-19 20:50:37 +00:00
Steve Wilkerson
0c51a9cab8 Prometheus: Update version
This updates the Prometheus version deployed by default from
2.3.2 to 2.12.0

Change-Id: Ic10e02a6b136a7f65fb686f5ef1adf1bcf6a9a9d
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-19 12:03:43 -06:00
Zuul
029f94a776 Merge "Grafana - Update cadvisor labels for k8s 1.16" 2019-11-18 20:55:10 +00:00
Steve Wilkerson
1bfa091203 Grafana: Update version
This updates the Grafana version deployed by default from 5.0.0 to
6.2.0

Change-Id: I39b5405cc3f3fe7754ed6544a8388ff912a4ef58
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-11-18 08:49:38 -06:00
Zuul
d0b4803b3c Merge "Add zookeeper chart to osh-infra" 2019-11-17 08:45:01 +00:00