1872 Commits

Author SHA1 Message Date
Zuul
f0306ce33d Merge "Sync wait-for-pods script with the one from openstack-helm" 2019-08-24 07:45:38 +00:00
Pete Birley
a5682e7db3 MariaDB: Move all config to be values driven
This PS moves to drive all mariadb config via the values fed
to the chart.

Change-Id: I4ed3624737af4d5c90b1b5de451a0a0b75a5eda1
Signed-off-by: Pete Birley <pete@port.direct>
2019-08-21 14:08:25 -05:00
Pete Birley
aba044cb0e Mariadb: define timeouts for wsrep
This PS updates the wsrep_provider_options to define the timeouts
explicitly for evs.suspect_timeout and gmcast.peer_timeout. Their
defaults are PT5S and PT3S respectively, which are increased by
a factor of approximately 5 to accommodate network instability that may
occur during node outage events.

Change-Id: Ie5cdd06d91299e5e2632b70cb9b50a7ad14f62b1
Signed-off-by: Pete Birley <pete@port.direct>
2019-08-21 14:48:05 +00:00
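The scaling described in the commit above can be sketched as follows. This is a minimal illustration, assuming whole-second ISO 8601 durations and a factor of exactly 5 (the commit only says "approximately 5"). The helper names are hypothetical and the values actually set by the chart may differ.

```python
import re

# Galera defaults quoted in the commit message above.
DEFAULTS = {"evs.suspect_timeout": "PT5S", "gmcast.peer_timeout": "PT3S"}


def scale_duration(duration: str, factor: int) -> str:
    """Scale a whole-second ISO 8601 duration such as 'PT5S' (hypothetical helper)."""
    match = re.fullmatch(r"PT(\d+)S", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return f"PT{int(match.group(1)) * factor}S"


def provider_options(defaults: dict, factor: int) -> str:
    """Render a wsrep_provider_options-style 'key=value; key=value' string."""
    return "; ".join(
        f"{key}={scale_duration(value, factor)}" for key, value in defaults.items()
    )


options = provider_options(DEFAULTS, 5)
print(options)
```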
Zuul
7c2c148fb0 Merge "Enable probes override from values.yaml for ovs" 2019-08-21 12:08:55 +00:00
Zuul
6639d0916b Merge "Enhance HTK Job Manifests to be more flexible" 2019-08-20 17:45:31 +00:00
rajesh.kudaka
2b66685594 Enable probes override from values.yaml for ovs
This commit enables overriding liveness/readiness probes
configurations for openvswitch pods from values.yaml

Change-Id: I4ec2b9e88bf8ed57e8ac9293f333969b63cef335
2019-08-19 16:34:03 +00:00
Chinasubbareddy Mallavarapu
1ff4811f06 [ceph-provisioner] Enable pvc resize feature
This enables the PVC resize feature so that a PVC can be resized when needed.

Change-Id: Ib5840b10087b39884cfd2249017c974aac407b30
2019-08-16 16:21:05 -05:00
sg774j
87afa2fb8c Rabbitmq: Correct reset_rabbit function
Made correction to this function to not attempt to delete
/var/lib/rabbitmq/

Change-Id: Ied16be1ec83d528f2660ef96389c3f236983aa79
2019-08-15 18:22:01 +00:00
BARTRA, RICK
f5df62d836 Run rabbitmq container with rabbitmq user
This change makes rabbitmq container run with the rabbitmq user
instead of the root user. As the rabbitmq user doesn't have write
access to '/run' directory, the templates are updated to use the
'/tmp' directory instead which the rabbitmq user has write access
to.

Change-Id: Ia35c3f741fefe3172c93bb042bf8d26bf7672cfc
2019-08-14 17:48:40 +00:00
Zuul
20dafdaddb Merge "Nagios – API Handling – HTTP Security Headers Not Present" 2019-08-14 00:59:23 +00:00
Zuul
a381200e8c Merge "Disable cephfs provisioner in multinode jobs" 2019-08-14 00:48:32 +00:00
Zuul
e11e9734bd Merge "Minikube: Expose Tiller http port for metrics" 2019-08-13 21:50:28 +00:00
Zuul
eb3ec04325 Merge "AIO multinode: Add root user directive to Kubelet" 2019-08-13 16:55:10 +00:00
Zuul
3f0cda712b Merge "Remove stale images from openstack-helm-infra" 2019-08-13 16:43:59 +00:00
Steve Wilkerson
d547063c37 Disable cephfs provisioner in multinode jobs
This disables the cephfs provisioner in the multinode
periodic jobs. It seems the helm tests for the ceph
provisioner chart that test cephfs fail more often than
not in the multinode jobs while passing reliably in the
single node check and gate jobs. As cephfs is still
gated, disabling the cephfs provisioner in the periodic
jobs allows for further investigation into this issue
without causing potential regressions

Change-Id: I36e68cc2e446afac8769fb9ab753105909341f24
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-08-13 14:49:27 +00:00
Drew Walters
354d53c4c3 AIO multinode: Add root user directive to Kubelet
Systemd units run as the root user by default; however, environment
variables in spawned processes are not populated for the root user
unless "User=root" is specified for a particular unit [0]. This change
adds the "User=root" declaration to the Kubelet systemd unit so that
Kubelet will look in the root user's home directory for Docker
configuration information. Without this change, Docker configuration
information, such as authentication keys for private repositories, is
ignored by Kubelet even though the Docker daemon honors it.

[0] https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Environment%20variables%20in%20spawned%20processes

Change-Id: I209de0f4f04c078d39b1e8bf18195e51e965cbf3
Signed-off-by: Drew Walters <andrew.walters@att.com>
2019-08-12 15:56:47 +00:00
Zuul
9b9309fe31 Merge "(postgresql) Cert auth for replication connections" 2019-08-08 21:16:15 +00:00
RAHUL KHIYANI
ac65a37b0b Nagios – API Handling – HTTP Security Headers Not Present
Added new X-Content-Type-Options: nosniff header to make sure the browser
does not try to detect a different Content-Type than what is actually
sent (can lead to XSS)

Added new X-Frame-Options: sameorigin header to protect against
drag and drop clickjacking attacks in older browsers

Added new Content-Security-Policy: script-src self for implementation

Added new HTTP security header X-XSS-Protection: 1; mode=block so that,
when an XSS attack is detected, the browser prevents rendering of the
page rather than trying to sanitize it

Change-Id: Ic79bbb96484a7f1a497c001883783338fd26a47a
2019-08-07 19:08:48 +00:00
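A rough sketch of the header set described in the commit above. The values are written in standard header syntax (quoted 'self', semicolon before mode=block), which is an assumption about what the chart emits. The function name is hypothetical, and how the Nagios web server actually applies the headers is not shown.

```python
# Header names come from the commit message above; the exact value syntax
# used by the chart is an assumption.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "sameorigin",
    "Content-Security-Policy": "script-src 'self'",
    "X-XSS-Protection": "1; mode=block",
}


def with_security_headers(headers: dict) -> dict:
    """Return a copy of the response headers with the security headers added."""
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)
    return merged


response_headers = with_security_headers({"Content-Type": "text/html"})
for name, value in response_headers.items():
    print(f"{name}: {value}")
```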
Steve Wilkerson
8573957fce Minikube: Expose Tiller http port for metrics
This updates the Minikube deployment to patch the tiller-deploy
service to add a port definition for the http (44135) port for
tiller, which is used to expose metrics for Prometheus to scrape

Change-Id: I2eb5d4001c37935674ce64012b2744030addc127
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-08-07 13:25:23 -05:00
Steve Wilkerson
443832a8fd Remove stale images from openstack-helm-infra
This removes the artifacts associated with images for libvirt,
mariadb, and vbmc from openstack-helm-infra as these images now
live in openstack-helm-images.

Change-Id: I5c97d2db89068c71ec1a56a5ac17007682711182
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-08-07 08:56:51 -05:00
Zuul
b310caef4f Merge "Grafana: Code for Calico Dashboard" 2019-08-06 21:39:48 +00:00
Zuul
4a8f788532 Merge "Generate CA crt and key if needed" 2019-08-06 18:14:08 +00:00
Hussey, Scott (sh8121)
9c27dd7576 (postgresql) Cert auth for replication connections
- Change the Postgres configuration to use x509 client
  certs for authenticating the replication connections
  between Patroni nodes. This is a straightforward way
  to support credential rotation for the replication user.
  Password authentication is problematic due to the declarative
  nature of helm charts and because an existing replication
  connection is required to replicate the rotated password.

Change-Id: I0c5456a01b3a36fee8ee4c986d25c4a1d807cb77
2019-08-06 00:03:54 -05:00
Zuul
8f749dd061 Merge "RabbitMQ: Dont remove definitions.json and erlang cookie when resetting" 2019-08-02 15:03:18 +00:00
Pete Birley
eef8ea131a RabbitMQ: Dont remove definitions.json and erlang cookie when resetting
This PS updates the reset node function to leave the assets generated
via init containers in place when resetting the node.

Change-Id: Iac52ca82e95bb372dbcbca0eeea3b262215e9c12
Signed-off-by: Pete Birley <pete@port.direct>
2019-08-02 02:05:00 +00:00
Steve Wilkerson
bc20c6c8b6 Elasticsearch: Add cron job to verify snapshot repositories
This adds a cron job to manually verify all snapshot repositories
are registered to any active master and data nodes. This is to
address scenarios where master and data nodes do not have the
desired snapshot repositories registered following node outages
or reboots

Change-Id: Ie6f42e95c3ca4dc2ec70f2852a2bde11e59ec097
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-08-02 02:02:14 +00:00
Zuul
26ed62352b Merge "Ceph-Client: update configmap name for defragosds cronjob" 2019-08-02 00:21:41 +00:00
Zuul
ea303850cd Merge "Elasticsearch: Manually verify snapshot repositories" 2019-08-01 18:36:37 +00:00
Chinasubbareddy Mallavarapu
acd5d11bc2 Ceph-Client: update configmap name for defragosds cronjob
This updates the configmap names used by the defragosds cronjob.

Change-Id: I29608cd8b6ce1e30615a0f92853939d7bbae9972
2019-08-01 12:22:48 -05:00
Zuul
d3d898de1b Merge "Nagios: Updated the alert for Ceph OSD Down" 2019-08-01 16:15:51 +00:00
Cliff Parsons
e059f4f827 Enhance HTK Job Manifests to be more flexible
This patch enhances the HTK job manifest functions so that each job can
be configured to use the desired backoffLimit and activeDeadlineSeconds,
and can mount the command/script from either a configMap or a secret
instead of being confined to using only configMaps.

Change-Id: I5231e53b98e3e55e3e93070876d8694f37ad642d
2019-08-01 09:20:12 -05:00
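The flexibility described in the commit above can be illustrated with a small manifest builder. This is a sketch in Python rather than the chart's Helm templates. The function name, default values, mount path, and volume name are all hypothetical; only the Kubernetes field names (backoffLimit, activeDeadlineSeconds, configMap, secret) are standard Job and volume fields.

```python
def job_manifest(name, image, script_source, source_type="configMap",
                 backoff_limit=6, active_deadline_seconds=1800):
    """Build a Kubernetes Job manifest as a plain dict (hypothetical helper)."""
    # The script volume may come from either a configMap or a secret.
    if source_type == "configMap":
        source = {"configMap": {"name": script_source, "defaultMode": 0o555}}
    elif source_type == "secret":
        source = {"secret": {"secretName": script_source, "defaultMode": 0o555}}
    else:
        raise ValueError(f"unsupported script source type: {source_type!r}")
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": backoff_limit,
            "activeDeadlineSeconds": active_deadline_seconds,
            "template": {
                "spec": {
                    "restartPolicy": "OnFailure",
                    "containers": [{
                        "name": name,
                        "image": image,
                        "command": ["/tmp/run.sh"],
                        "volumeMounts": [{
                            "name": "job-script",
                            "mountPath": "/tmp/run.sh",
                            "subPath": "run.sh",
                            "readOnly": True,
                        }],
                    }],
                    "volumes": [{"name": "job-script", **source}],
                }
            },
        },
    }


manifest = job_manifest("db-init", "example/image:tag", "db-init-scripts",
                        source_type="secret", backoff_limit=3,
                        active_deadline_seconds=600)
print(manifest["spec"]["template"]["spec"]["volumes"])
```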
Pai, Radhika (rp592h)
a37925c7e8 Grafana: Code for Calico Dashboard
Appended the code that adds the Calico dashboard to Grafana. The dashboard
displays the Felix metrics collected by Prometheus.

Change-Id: If18a18949f8093747b3f9ba819e036778c40b84e
2019-07-31 20:53:55 +00:00
Zuul
85b8d62830 Merge "Provide option to switch between dpdk and non-dpdk" 2019-07-31 20:38:22 +00:00
Manuel Buil
a71f1b4d33 Provide option to switch between dpdk and non-dpdk
We can select if we want an image with dpdk support by adding:

FEATURE_GATES=dpdk

That way we can reuse the same script for different distros by using
openstack-helm/tools/deployment/common/get-values-overrides.sh

Change-Id: Ia2c53556be650899fdd67c1ec06f5c68ae63c9d4
Signed-off-by: Manuel Buil <mbuil@suse.com>
2019-07-31 15:54:51 +00:00
Ahmad Mahmoudi
db164a2925 Generate CA crt and key if needed
Generate CA cert and CA key, if they are not present in
the values.

Change-Id: I14610ab66b72ddd5e6e45f57b56968e462416234
2019-07-30 13:16:03 -05:00
Arun Kant
7a8bb7058b Removing deprecated option usage in gather pod logs logic
As per PR https://github.com/kubernetes/kubernetes/pull/60210,
the show-all option of kubectl get is deprecated and no longer needed.
Presumably that is now the default behavior.

Also in current logs gathering logic, we are interested in capturing
only pod names, so removing that option is harmless.

We are seeing related failures in local CI when kubectl version is
1.15.x. So removing this option.

Change-Id: I3886c792fe28bc8b80504d8c91e9524039131b15
2019-07-30 08:19:38 -07:00
Steve Wilkerson
8130e6bdc5 Elasticsearch: Manually verify snapshot repositories
This updates the script for registering snapshot repositories to
include a manual verification of the repositories created. This
simply allows for inspection of all master and data nodes the
repository is verified with to provide additional visibility into
the state of all repositories

Change-Id: I6e5386386e2b79b1cb0f41fc1f9b78817695f8f3
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-07-24 15:37:23 -05:00
Zuul
17a7eb5cdc Merge "Restore overrides functionality after regression" 2019-07-24 16:25:49 +00:00
Anderson, Craig (ca846m)
ab8c81f2ee Restore overrides functionality after regression
Revert 833d426da8e4b049277ca9847830f6e6beee40c3

https://review.opendev.org/#/c/667022 introduced a regression in the
overrides functionality, which caused the corresponding gate test to
fail. This "fixed" a problem by breaking the override capability.

This patchset reverts the previous to restore override functionality and
make gates green again. Deep copy is added in order to resolve the
original problem that 667022 attempted to resolve.

Change-Id: I6c052c0fabe0067612d6a3d9d3bfac4df59202d7
2019-07-24 12:18:44 +00:00
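One common reason a deep copy is needed when applying overrides is that a shallow copy shares nested structures with the defaults. The sketch below shows that effect in Python. It is an analogy only, not the chart's Go template code, and the merge helper and sample values are hypothetical.

```python
import copy


def merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base, mutating and returning base."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge(base[key], value)
        else:
            base[key] = value
    return base


defaults = {"conf": {"workers": 1, "debug": False}}

# Shallow copy: the nested "conf" dict is shared, so the merge leaks into defaults.
shallow = merge(dict(defaults), {"conf": {"workers": 4}})
leaked = defaults["conf"]["workers"]

# Deep copy: the override set gets its own nested structure.
defaults = {"conf": {"workers": 1, "debug": False}}
isolated = merge(copy.deepcopy(defaults), {"conf": {"workers": 4}})
print(leaked, defaults["conf"]["workers"], isolated["conf"]["workers"])
```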
Chinasubbareddy Mallavarapu
dc66254c42 Ceph-RGW: fix file permission issue
This fixes a file permission issue on the file
/var/lib/ceph/bootstrap-rgw/ceph.keyring, which is owned by
root.

The issue occurs when a node running rgw reboots and the rgw pods fail at
init after the reboot. It has been seen on single-node deployments.

issue:

ceph-rgw-5db485fbd9-dv778  0/1  Init:CrashLoopBackOff   5  6m49s

logs:
+ chown -R ceph. /run/ceph/ /var/lib/ceph/bootstrap-rgw /var/lib/ceph/radosgw
/var/lib/ceph/tmp
chown: changing ownership of
'/var/lib/ceph/bootstrap-rgw/ceph.keyring': Operation not permitted

Change-Id: Idcb648c205053b2f03357b59173e70e02f28688c
2019-07-23 10:52:31 -05:00
Zuul
b7a7e81056 Merge "Updated the CEPH Cluster Health Panel values" 2019-07-22 15:29:36 +00:00
Zuul
f4a9e2b43c Merge "Fix mon_host hosts when hostname contains 'ip'" 2019-07-19 21:24:59 +00:00
Pai, Radhika (rp592h)
47565d2d19 Nagios: Updated the alert for Ceph OSD Down
Previously the Nagios alert was percentage based: it fired only when more
than 80 percent of the OSDs were down.
>check_prom_alert!ceph_osd_down_pct_high!CRITICAL- CEPH OSDs down is
more than 80 percent!OK- CEPH OSDs down is less than 80 percent

Updated the code in nagios values.yaml to send alert when even 1 OSD is
down:
>check_prom_alert!ceph_osd_down!CRITICAL- One or more CEPH OSDs are down
>for more than 5 minutes!OK- All the CEPH OSDs are up

Change-Id: Id24c4a0cca64674890dae3599edc0c90d9534e90
2019-07-19 19:25:53 +00:00
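In the check definitions quoted above, "!" separates the command name from its arguments, as in Nagios check_command values. A minimal sketch of splitting such a definition, assuming the wrapped lines of the new definition join into a single string; the function name is hypothetical.

```python
def parse_check_command(definition: str) -> dict:
    """Split a Nagios-style 'command!arg1!arg2' string into its parts."""
    command, *args = definition.split("!")
    return {"command": command, "args": args}


new_check = parse_check_command(
    "check_prom_alert!ceph_osd_down!"
    "CRITICAL- One or more CEPH OSDs are down for more than 5 minutes!"
    "OK- All the CEPH OSDs are up"
)
print(new_check["command"], new_check["args"][0])
```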
Doug Aaser
9a36becf20 Cleanup unused Postgres config values
This patch is part of an effort to cleanup the values.yaml file for
Postgres, which has gotten messy since the introduction of Patroni. This
patch specifically removes unused configuration values which were
causing unnecessary bloat and complexity.

Change-Id: I96180fd9c91200ba7558e58bd503b4ef9ebc183e
2019-07-19 17:16:04 +00:00
Daniel Pawlik
0b58aea135 Fix mon_host hosts when hostname contains 'ip'
The ceph-mon template script parses mon_host incorrectly when a
hostname contains the string 'ip', e.g. airship.

Change-Id: I0a097443d42ad2e9b6be6c61facd7932ddb4b3bb
Story: 2006255
2019-07-19 10:49:50 +00:00
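The class of bug described above can be reproduced with a short sketch. The endpoint lines, function names, and port below are hypothetical and are not taken from the ceph-mon script; the point is only that a bare substring match on "ip" also hits a hostname such as "airship-1".

```python
import re

# Hypothetical endpoint listing; one hostname contains the letters "ip".
LINES = [
    '"ip": "10.0.0.11",',
    '"hostname": "airship-1",',
    '"ip": "10.0.0.12",',
    '"hostname": "node-2",',
]


def naive_mon_hosts(lines, port):
    """Buggy: treats every line containing the substring 'ip' as an address line."""
    return [f"{line.split(chr(34))[3]}:{port}" for line in lines if "ip" in line]


def strict_mon_hosts(lines, port):
    """Fixed: only lines whose key is exactly "ip" contribute an address."""
    pattern = re.compile(r'^\s*"ip"\s*:\s*"([^"]+)"')
    return [f"{m.group(1)}:{port}" for m in map(pattern.match, lines) if m]


print(naive_mon_hosts(LINES, 6789))
print(strict_mon_hosts(LINES, 6789))
```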
Pete Birley
af270934d4 Rabbit: Eradicate potential crashes in wait job while upgrading cluster
When upgrading or reconfiguring a rabbit cluster, it is possible that the nodes
will not return the cluster status for some time. This PS allows the wait job to
cope with this much more gracefully than simply crashing a few times before
proceeding.

Change-Id: Ibf525df9e3a9362282f70e5dbb136430734181fd
Signed-off-by: Pete Birley <pete@port.direct>
2019-07-18 23:07:32 +00:00
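The behaviour described above, tolerating a temporarily unavailable cluster status instead of crashing, can be sketched as a bounded retry loop. This is a generic illustration in Python, not the chart's wait job; the function name, parameters, and defaults are hypothetical.

```python
import time


def wait_for_cluster_status(get_status, attempts=30, delay=1.0, sleep=time.sleep):
    """Poll get_status until it returns without raising, or attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return get_status()
        except Exception as error:  # the node may simply not be ready yet
            last_error = error
            if attempt < attempts:
                sleep(delay)
    raise TimeoutError(f"no cluster status after {attempts} attempts") from last_error
```

A caller would pass in whatever function queries the cluster, for example a wrapper around a status command, and handle TimeoutError as the single failure path.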
Zuul
2c8b18aeb8 Merge "Openvswitch: Fix typo in image overrides" 2019-07-18 20:30:45 +00:00
Zuul
0c3a46ae6e Merge "Helm-Toolkit: Add a function to return quoted csv string from a list" 2019-07-18 20:15:12 +00:00
Zuul
e29022f8ae Merge "Revert "CI: Make openstack-support and keystone-auth jobs nonvoting"" 2019-07-18 19:47:54 +00:00
Manuel Buil
dc1b4dd1c5 Openvswitch: Fix typo in image overrides
The tag is pointing to a libvirt image. It should point to the
openvswitch image

Change-Id: If95a7b9cce2cadcb644389c28799fff48572c549
Signed-off-by: Manuel Buil <mbuil@suse.com>
2019-07-18 18:43:25 +00:00