1862 Commits

Author SHA1 Message Date
BARTRA, RICK
f5df62d836 Run rabbitmq container with rabbitmq user
This change makes the rabbitmq container run as the rabbitmq user
instead of the root user. As the rabbitmq user doesn't have write
access to the '/run' directory, the templates are updated to use the
'/tmp' directory instead, to which the rabbitmq user has write
access.

Change-Id: Ia35c3f741fefe3172c93bb042bf8d26bf7672cfc
2019-08-14 17:48:40 +00:00
Zuul
20dafdaddb Merge "Nagios – API Handling – HTTP Security Headers Not Present" 2019-08-14 00:59:23 +00:00
Zuul
a381200e8c Merge "Disable cephfs provisioner in multinode jobs" 2019-08-14 00:48:32 +00:00
Zuul
e11e9734bd Merge "Minikube: Expose Tiller http port for metrics" 2019-08-13 21:50:28 +00:00
Zuul
eb3ec04325 Merge "AIO multinode: Add root user directive to Kubelet" 2019-08-13 16:55:10 +00:00
Zuul
3f0cda712b Merge "Remove stale images from openstack-helm-infra" 2019-08-13 16:43:59 +00:00
Steve Wilkerson
d547063c37 Disable cephfs provisioner in multinode jobs
This disables the cephfs provisioner in the multinode
periodic jobs. It seems the helm tests for the ceph
provisioner chart that test cephfs fail more often than
not in the multinode jobs while passing reliably in the
single node check and gate jobs. As cephfs is still
gated, disabling the cephfs provisioner in the periodic
jobs allows for further investigation into this issue
without causing potential regressions

Change-Id: I36e68cc2e446afac8769fb9ab753105909341f24
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-08-13 14:49:27 +00:00
Drew Walters
354d53c4c3 AIO multinode: Add root user directive to Kubelet
Systemd units run as the root user by default; however, environment
variables in spawned processes are not populated for the root user
unless "User=root" is specified for a particular unit [0]. This change
adds the "User=root" declaration to the Kubelet systemd unit so that
Kubelet will look in the root user's home directory for Docker
configuration information. Without this change, Docker configuration
information, such as authentication keys for private repositories, is
ignored by Kubelet even though the Docker daemon honors it.

[0] https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Environment%20variables%20in%20spawned%20processes

Change-Id: I209de0f4f04c078d39b1e8bf18195e51e965cbf3
Signed-off-by: Drew Walters <andrew.walters@att.com>
2019-08-12 15:56:47 +00:00
Zuul
9b9309fe31 Merge "(postgresql) Cert auth for replication connections" 2019-08-08 21:16:15 +00:00
RAHUL KHIYANI
ac65a37b0b Nagios – API Handling – HTTP Security Headers Not Present
Added new X-Content-Type-Options: nosniff header to make sure the browser
does not try to detect a different Content-Type than what is actually
sent (can lead to XSS)

Added new X-Frame-Options: sameorigin header to protect against
drag and drop clickjacking attacks in older browsers

Added new Content-Security-Policy: script-src self for implementation

Added new HTTP security header X-XSS-Protection: 1; mode=block so
that, when an XSS attack is detected, the browser will prevent
rendering of the page

Change-Id: Ic79bbb96484a7f1a497c001883783338fd26a47a
2019-08-07 19:08:48 +00:00
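
The headers listed in the commit above can be checked with a small sketch. The dictionary values follow the commit message, written in standard header syntax; the function name and the check itself are hypothetical and are not taken from the Nagios chart.

```python
# Hypothetical sketch: header values follow the commit message above;
# the helper below is illustrative and not part of the chart.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "sameorigin",
    "Content-Security-Policy": "script-src 'self'",
    "X-XSS-Protection": "1; mode=block",
}


def missing_security_headers(response_headers):
    """Return expected header names absent from a response (case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return sorted(
        name for name in SECURITY_HEADERS if name.lower() not in present
    )
```

Passing the headers of a real response (for example, a dict built from an HTTP client's response object) would list which of the four headers are still missing.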
Steve Wilkerson
8573957fce Minikube: Expose Tiller http port for metrics
This updates the Minikube deployment to patch the tiller-deploy
service to add a port definition for the http (44135) port for
tiller, which is used to expose metrics for Prometheus to scrape

Change-Id: I2eb5d4001c37935674ce64012b2744030addc127
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-08-07 13:25:23 -05:00
Steve Wilkerson
443832a8fd Remove stale images from openstack-helm-infra
This removes the artifacts associated with images for libvirt,
mariadb, and vbmc from openstack-helm-infra as these images now
live in openstack-helm-images.

Change-Id: I5c97d2db89068c71ec1a56a5ac17007682711182
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-08-07 08:56:51 -05:00
Zuul
b310caef4f Merge "Grafana: Code for Calico Dashboard" 2019-08-06 21:39:48 +00:00
Zuul
4a8f788532 Merge "Generate CA crt and key if needed" 2019-08-06 18:14:08 +00:00
Hussey, Scott (sh8121)
9c27dd7576 (postgresql) Cert auth for replication connections
- Change the Postgres configuration to use x509 client
  certs for authenticating the connections for replicating
  between Patroni nodes. This is a straightforward solution
  for supporting credential rotation for the replication user.
  Password authentication is problematic due to the declarative
  nature of helm charts and requiring an existing replication
  connection to replicate the rotated password.

Change-Id: I0c5456a01b3a36fee8ee4c986d25c4a1d807cb77
2019-08-06 00:03:54 -05:00
Zuul
8f749dd061 Merge "RabbitMQ: Don't remove definitions.json and erlang cookie when resetting" 2019-08-02 15:03:18 +00:00
Pete Birley
eef8ea131a RabbitMQ: Don't remove definitions.json and erlang cookie when resetting
This PS updated the reset node function to leave the assets generated
via init containers in place when resetting the node.

Change-Id: Iac52ca82e95bb372dbcbca0eeea3b262215e9c12
Signed-off-by: Pete Birley <pete@port.direct>
2019-08-02 02:05:00 +00:00
Steve Wilkerson
bc20c6c8b6 Elasticsearch: Add cron job to verify snapshot repositories
This adds a cron job to manually verify all snapshot repositories
are registered to any active master and data nodes. This is to
address scenarios where master and data nodes do not have the
desired snapshot repositories registered following node outages
or reboots

Change-Id: Ie6f42e95c3ca4dc2ec70f2852a2bde11e59ec097
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-08-02 02:02:14 +00:00
Zuul
26ed62352b Merge "Ceph-Client: update configmap name for defragosds cronjob" 2019-08-02 00:21:41 +00:00
Zuul
ea303850cd Merge "Elasticsearch: Manually verify snapshot repositories" 2019-08-01 18:36:37 +00:00
Chinasubbareddy Mallavarapu
acd5d11bc2 Ceph-Client: update configmap name for defragosds cronjob
This updates the configmap names used by the defragosds cronjob.

Change-Id: I29608cd8b6ce1e30615a0f92853939d7bbae9972
2019-08-01 12:22:48 -05:00
Zuul
d3d898de1b Merge "Nagios: Updated the alert for Ceph OSD Down" 2019-08-01 16:15:51 +00:00
Pai, Radhika (rp592h)
a37925c7e8 Grafana: Code for Calico Dashboard
Added code that adds the Calico dashboard to Grafana. The dashboard
displays the Felix metrics collected by Prometheus.

Change-Id: If18a18949f8093747b3f9ba819e036778c40b84e
2019-07-31 20:53:55 +00:00
Zuul
85b8d62830 Merge "Provide option to switch between dpdk and non-dpdk" 2019-07-31 20:38:22 +00:00
Manuel Buil
a71f1b4d33 Provide option to switch between dpdk and non-dpdk
We can select if we want an image with dpdk support by adding:

FEATURE_GATES=dpdk

That way we can reuse the same script for different distros by using
openstack-helm/tools/deployment/common/get-values-overrides.sh

Change-Id: Ia2c53556be650899fdd67c1ec06f5c68ae63c9d4
Signed-off-by: Manuel Buil <mbuil@suse.com>
2019-07-31 15:54:51 +00:00
Ahmad Mahmoudi
db164a2925 Generate CA crt and key if needed
Generate CA cert and CA key, if they are not present in
the values.

Change-Id: I14610ab66b72ddd5e6e45f57b56968e462416234
2019-07-30 13:16:03 -05:00
Arun Kant
7a8bb7058b Removing deprecated option usage in gather pod logs logic
As per PR https://github.com/kubernetes/kubernetes/pull/60210, the
--show-all option of kubectl get is deprecated and no longer needed.
Presumably showing all resources is now the default behavior.

Also, the current log-gathering logic is only interested in capturing
pod names, so removing that option is harmless.

We are seeing related failures in local CI when the kubectl version is
1.15.x, so this change removes the option.

Change-Id: I3886c792fe28bc8b80504d8c91e9524039131b15
2019-07-30 08:19:38 -07:00
Steve Wilkerson
8130e6bdc5 Elasticsearch: Manually verify snapshot repositories
This updates the script for registering snapshot repositories to
include a manual verification of the repositories created. This
simply allows for inspection of all master and data nodes the
repository is verified with to provide additional visibility into
the state of all repositories

Change-Id: I6e5386386e2b79b1cb0f41fc1f9b78817695f8f3
Signed-off-by: Steve Wilkerson <sw5822@att.com>
2019-07-24 15:37:23 -05:00
Zuul
17a7eb5cdc Merge "Restore overrides functionality after regression" 2019-07-24 16:25:49 +00:00
Anderson, Craig (ca846m)
ab8c81f2ee Restore overrides functionality after regression
Revert 833d426da8e4b049277ca9847830f6e6beee40c3

https://review.opendev.org/#/c/667022 introduced a regression in the
overrides functionality, which caused the corresponding gate test to
fail. This "fixed" a problem by breaking the override capability.

This patchset reverts the previous change to restore override functionality
and make gates green again. Deep copy is added in order to resolve the
original problem that 667022 attempted to resolve.

Change-Id: I6c052c0fabe0067612d6a3d9d3bfac4df59202d7
2019-07-24 12:18:44 +00:00
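
Why a deep copy helps when merging overrides can be shown with a short analogy. The real change lives in Helm templates, not Python, and the commit message does not say what the original problem was; this sketch assumes it was shared default data being mutated by a merge. All names here are hypothetical.

```python
import copy


def merge(base, override):
    """Recursively merge override into base in place and return base."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge(base[key], value)
        else:
            base[key] = value
    return base


# Hypothetical default values shared by every rendering.
defaults = {"conf": {"workers": 1}}

# Deep copy first, so the merge never mutates the shared defaults.
rendered = merge(copy.deepcopy(defaults), {"conf": {"workers": 4}})
```

With the deep copy, `rendered` carries the override while `defaults` stays untouched for the next merge; without it, nested dictionaries in `defaults` would be modified in place.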
Chinasubbareddy Mallavarapu
dc66254c42 Ceph-RGW: fix file permission issue
This fixes a file permission issue with the file
/var/lib/ceph/bootstrap-rgw/ceph.keyring, which is owned by root.

This happens when a node running rgw reboots and the rgw pods fail at
init after the reboot. It has been seen on single-node deployments.

issue:

ceph-rgw-5db485fbd9-dv778  0/1  Init:CrashLoopBackOff   5  6m49s

logs:
+ chown -R ceph. /run/ceph/ /var/lib/ceph/bootstrap-rgw /var/lib/ceph/radosgw
/var/lib/ceph/tmp
chown: changing ownership of
'/var/lib/ceph/bootstrap-rgw/ceph.keyring': Operation not permitted

Change-Id: Idcb648c205053b2f03357b59173e70e02f28688c
2019-07-23 10:52:31 -05:00
Zuul
b7a7e81056 Merge "Updated the CEPH Cluster Health Panel values" 2019-07-22 15:29:36 +00:00
Zuul
f4a9e2b43c Merge "Fix mon_host hosts when hostname contains 'ip'" 2019-07-19 21:24:59 +00:00
Pai, Radhika (rp592h)
47565d2d19 Nagios: Updated the alert for Ceph OSD Down
Previously the Nagios alert was percentage based: it fired when more than
80 percent of OSDs were down.
>check_prom_alert!ceph_osd_down_pct_high!CRITICAL- CEPH OSDs down is
more than 80 percent!OK- CEPH OSDs down is less than 80 percent

Updated the code in the nagios values.yaml to send an alert when even one
OSD is down:
>check_prom_alert!ceph_osd_down!CRITICAL- One or more CEPH OSDs are down
>for more than 5 minutes!OK- All the CEPH OSDs are up

Change-Id: Id24c4a0cca64674890dae3599edc0c90d9534e90
2019-07-19 19:25:53 +00:00
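
The difference between the two alert conditions in the commit above can be sketched as follows. The function names are hypothetical, and the "for more than 5 minutes" duration from the new alert text is deliberately not modeled.

```python
def pct_alert(osds_down, osds_total, threshold_pct=80):
    """Old behavior: alert only when more than threshold_pct of OSDs are down."""
    if osds_total == 0:
        return False
    return 100 * osds_down / osds_total > threshold_pct


def any_down_alert(osds_down):
    """New behavior: alert as soon as at least one OSD is down."""
    return osds_down >= 1
```

With one OSD down out of ten, the percentage-based check stays quiet while the new check fires.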
Doug Aaser
9a36becf20 Cleanup unused Postgres config values
This patch is part of an effort to clean up the values.yaml file for
Postgres, which has gotten messy since the introduction of Patroni. This
patch specifically removes unused configuration values which were
causing unnecessary bloat and complexity.

Change-Id: I96180fd9c91200ba7558e58bd503b4ef9ebc183e
2019-07-19 17:16:04 +00:00
Daniel Pawlik
0b58aea135 Fix mon_host hosts when hostname contains 'ip'
The ceph-mon template script parses mon_host incorrectly when a
hostname contains the string 'ip', e.g. airship.

Change-Id: I0a097443d42ad2e9b6be6c61facd7932ddb4b3bb
Story: 2006255
2019-07-19 10:49:50 +00:00
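
The class of bug described in the commit above can be illustrated with a small sketch. The input lines, function names, and port are hypothetical, and the real template script is not reproduced here; the sketch only assumes that addresses are picked out of JSON-like text by matching on `ip`.

```python
def _value(line):
    # Fourth field when splitting on double quotes: '"key": "value",' -> value
    return line.split('"')[3]


def naive_mon_hosts(lines, port):
    """Buggy: any line containing the substring 'ip' is treated as an address."""
    return [f"{_value(line)}:{port}" for line in lines if "ip" in line]


def strict_mon_hosts(lines, port):
    """Fixed: only lines whose key is exactly "ip" are treated as addresses."""
    return [f"{_value(line)}:{port}" for line in lines if '"ip"' in line]


# Hypothetical input: the hostname "airship-1" contains the letters "ip".
sample = ['"hostname": "airship-1",', '"ip": "10.0.0.1",']
```

Here `naive_mon_hosts(sample, 6789)` also returns an entry for the hostname line, while `strict_mon_hosts(sample, 6789)` returns only the address.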
Pete Birley
af270934d4 Rabbit: Eradicate potential crashes in wait job while upgrading cluster
When upgrading or reconfiguring a rabbit cluster, it's possible that the
nodes will not return the cluster status for some time. This PS allows us
to cope with this much more gracefully than simply crashing a few times
before proceeding.

Change-Id: Ibf525df9e3a9362282f70e5dbb136430734181fd
Signed-off-by: Pete Birley <pete@port.direct>
2019-07-18 23:07:32 +00:00
Zuul
2c8b18aeb8 Merge "Openvswitch: Fix typo in image overrides" 2019-07-18 20:30:45 +00:00
Zuul
0c3a46ae6e Merge "Helm-Toolkit: Add a function to return quoted csv string from a list" 2019-07-18 20:15:12 +00:00
Zuul
e29022f8ae Merge "Revert "CI: Make openstack-support and keystone-auth jobs nonvoting"" 2019-07-18 19:47:54 +00:00
Manuel Buil
dc1b4dd1c5 Openvswitch: Fix typo in image overrides
The tag is pointing to a libvirt image. It should point to the
openvswitch image

Change-Id: If95a7b9cce2cadcb644389c28799fff48572c549
Signed-off-by: Manuel Buil <mbuil@suse.com>
2019-07-18 18:43:25 +00:00
Pete Birley
af17153627 RabbitMQ: prune any extra nodes from cluster if scaling down
This PS updates the cluster wait job to prune any extra nodes from
the cluster if scaling down.

Change-Id: I58d22121a07cd99448add62502582a6873776622
Signed-off-by: Pete Birley <pete@port.direct>
2019-07-18 17:21:37 +00:00
cheng li
776885458a Revert "CI: Make openstack-support and keystone-auth jobs nonvoting"
This reverts commit 5e3f729ffe5692e6e37d0fe6378906662d94bbd0.

Change-Id: I65cb5d24f0538fbd0d6cd28e5e6313e679d87655
2019-07-17 14:06:21 +00:00
Pete Birley
e96bdd9fb6 Ingress: Clean up tmp dir entirely on container start
This PS cleans up the container dir entirely on container restart,
as sometimes remnants of previous runs can cause issues.

Change-Id: I873667a8a57bca6096cbe777ee83ef8648a368d4
Signed-off-by: Pete Birley <pete@port.direct>
2019-07-16 01:21:02 +00:00
Alexander Noskov
3b5a1c7909 Take dnsPolicy from .Values.pod.dns_policy variable
Change-Id: Iae7caa5bdefe7749231c031c6003591a6251fa97
2019-07-15 17:31:16 +00:00
Zuul
769d0980f0 Merge "Prometheus: Fix volume utilization alert expression" 2019-07-14 04:49:18 +00:00
Zuul
e01741589a Merge "Tenant-Ceph: Enable cephfs storage class provisioning" 2019-07-13 16:16:54 +00:00
Zuul
79c9777bf4 Merge "Remove quotes for bind-address in ingress Chart" 2019-07-13 14:21:48 +00:00
Alexander Noskov
0eff94f51c Remove quotes for bind-address in ingress Chart
Currently, we are getting `bind-address: null` in ingress-conf for the ingress pod in the kube-system namespace.
In that case, nginx starts on 0.0.0.0:80, which breaks other ingress controllers, such as maas-ingress.
No further ingress controllers can start because they can't bind to port 80.

Change-Id: Ie7e9563bf14fe347969bea0d3c900c8d87d06de0
2019-07-12 17:10:00 -05:00
Drew Walters
8ba46703ee CI: Restore Xenial compatibility in K8s script
Recently, the Minikube gate script was modified to support Ubuntu Bionic
[0]; however, the change made the script incompatible with Ubuntu Xenial
because libxtables12 is not available on Ubuntu Xenial. OpenStack-Helm
still supports Ubuntu Xenial, and this script should too.

This change modifies the gate script to install iptables instead of
libxtables12. The iptables package depends on libxtables11 on Ubuntu
Xenial and libxtables12 on Ubuntu Bionic, so this achieves the same
result.

[0] https://review.opendev.org/650523

Change-Id: I5afbcfeca6e7b30857a44aed35a360595eeb5037
Signed-off-by: Drew Walters <andrew.walters@att.com>
2019-07-12 13:50:22 +00:00