1145 Commits

Author SHA1 Message Date
Zuul
33178a529d Merge "Fluentd: Remove unused liveness port" 2019-01-30 04:12:48 +00:00
Zuul
8028bcb641 Merge "[tiller] Disable monitoring by default, enable in gate" 2019-01-30 04:12:47 +00:00
Zuul
0963980b51 Merge "[Prometheus] Relax disk IO constraints" 2019-01-30 04:12:46 +00:00
Zuul
ba68a8c745 Merge "[Prometheus] Fix filesystem space checks" 2019-01-30 04:12:45 +00:00
Zuul
3fa8fbea1a Merge "[ingress] explicitly specify the Prometheus scrape port" 2019-01-30 04:12:44 +00:00
Pete Birley
bf4713f04b HTK: Support tls secrets on non-fqdn overridden hosts in ingress
This PS adds support for tls secrets on non-fqdn overridden hosts
in ingress rules.

Change-Id: I134af614e7c2ac3fae6eba2bc4bda9f8b41f7f78
Signed-off-by: Pete Birley <pete@port.direct>
2019-01-29 23:34:18 +00:00
Zuul
a6aabe0feb Merge "Liveness probes for OpenVSwitch daemons." 2019-01-29 23:06:07 +00:00
Steve Wilkerson
39410b16bc Fluentd: Remove unused liveness port
This removes an unused port left over from a previous implementation
of the fluentd liveness probe.

Change-Id: I80367bcf6fedc75b3ee7054eba9c382fbb4bc79d
2019-01-29 14:31:50 -06:00
Zuul
4aca509aaf Merge "[CEPH] Clean up PG troubleshooting option specific to Luminous" 2019-01-29 20:23:53 +00:00
Hemachandra Reddy
aef0ff7810 Liveness probes for OpenVSwitch daemons.
Uses ovs-vsctl for ovs-db.
Uses ovs-appctl for ovs-vswitchd, as "ovs-vsctl show" does not
talk to ovs-vswitchd.

Change-Id: Ia0b84e3546ff1693676ca61370e1344d75b6e308
2019-01-29 20:10:41 +00:00
Zuul
6051d5e450 Merge "Helm-Toolkit: Make ingress manifest work for more than public endpoints" 2019-01-29 20:06:01 +00:00
Chris Wedgwood
a6fa47eea5 [tiller] Disable monitoring by default, enable in gate
Change-Id: Idb7a1f0046e96261a7042d30eedfaea031b27209
2019-01-29 18:57:58 +00:00
Matthew Heler
f48c365cd3 [CEPH] Clean up PG troubleshooting option specific to Luminous
Clean up the PG troubleshooting method that was needed for
Luminous images. Since we are now on Mimic, this function is no
longer needed.

Change-Id: Iccb148120410b956c25a1fed5655b3debba3412c
2019-01-29 18:57:23 +00:00
Zuul
7b5d6e9237 Merge "OSH-Infra: Update multinode and aio-monitoring/logging jobs" 2019-01-29 17:03:18 +00:00
Zuul
2de223b863 Merge "Add proxy support to Minikube gate script" 2019-01-29 17:03:17 +00:00
Pete Birley
3eb0517fc9 Helm-Toolkit: Make ingress manifest work for more than public endpoints
This PS enables the ingress manifest function to work for all endpoints
rather than just public.

Change-Id: I3b454bb24a763f51896e845b767fd9d28f5b07dc
Signed-off-by: Pete Birley <pete@port.direct>
2019-01-29 08:53:06 -06:00
Chris Wedgwood
d7808468fc [Prometheus] Relax disk IO constraints
Relax the timing constraints for disk IO to accommodate rotating disks;
a "measured IO" might be the result of a small number of physical IOs,
allow for enough time for a small number of disk rotations (this isn't
perfect but seems to be about right in testing under load).

Change-Id: Ifb067a2218528e5918d2f4b2ba169b6e739084e0
2019-01-29 06:41:51 +00:00
Chris Wedgwood
4fb6ee6e35 [Prometheus] Fix filesystem space checks
Change-Id: Id527ea6e08070cb7d2634417a7c203c1c5c3d97c
2019-01-29 06:34:54 +00:00
Chris Wedgwood
03ee843b22 [ingress] explicitly specify the Prometheus scrape port
Change-Id: I9e191257c436ca6ab74d013feb07bb0ffed2d532
2019-01-29 04:42:26 +00:00
Pete Birley
0a077f8996 HTK: update fqdn hostname lookup to support host keys
This PS adds support for maps containing `host` for use within
the endpoint host lookup functions as well as a simple string.

Change-Id: Ifddfb935bf12510a8b8fac25a4a18b4314845230
Signed-off-by: Pete Birley <pete@port.direct>
2019-01-28 15:32:51 -06:00
Pete Birley
633d99c2ff HTK: Update host and port function to call correct host function
This PS updates the host and port function to call the correct
host function to allow ip addresses to be rendered if required.

Change-Id: I55c91bd911875b537a54ac76cda03a126649af80
Signed-off-by: Pete Birley <pete@port.direct>
2019-01-28 12:14:20 -06:00
Drew Walters
fd7add74ac Add proxy support to Minikube gate script
This commit introduces proxy support to the Minikube gate script by
leveraging existing `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY`
environment variables.  Additionally, this adds the ability to interpret
DNS nameservers when running behind a proxy server and use those in
`/etc/resolv.conf` over the Google DNS servers.

Change-Id: I508dd00fb7df33945e8ee96af250a8eff9db389a
2019-01-28 11:51:12 -06:00
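The proxy handling described in fd7add74ac can be illustrated with a minimal Python sketch. It is not the gate script itself (that script's implementation is not shown here); the function names `proxy_settings` and `nameservers` are hypothetical, and the sketch only assumes the three environment variables named in the commit message and the standard `nameserver <address>` line format of `/etc/resolv.conf`.

```python
import os


def proxy_settings(environ=None):
    """Collect the proxy variables named in the commit message, if set.

    Hypothetical helper for illustration; empty values are treated as unset.
    """
    environ = os.environ if environ is None else environ
    names = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")
    return {name: environ[name] for name in names if environ.get(name)}


def nameservers(resolv_conf_text):
    """Return the addresses from `nameserver` lines of resolv.conf content."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Comment lines and other directives (search, options) are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


if __name__ == "__main__":
    print(proxy_settings())
    print(nameservers("search example.test\nnameserver 10.0.0.2\n"))
```

When `proxy_settings` returns an empty mapping, a caller would presumably keep its non-proxy defaults; that fallback is an assumption and is not modelled here.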
Pete Birley
26fd3f6be3 HTK: support a map for endpoint host lookups
This PS adds support for maps containing `host` for use within
the endpoint host lookup functions as well as a simple string.

Change-Id: I21818676e3e907452912b7c7e3c5765e53aebc64
Signed-off-by: Pete Birley <pete@port.direct>
2019-01-28 11:21:29 -06:00
Pete Birley
52f6591f70 HTK: update .gitignore to exclude htk development files
This PS updates the .gitignore to not add the files commonly used
for htk development by default.

Change-Id: Ic7b3711c3311ecef43b55342ae487078b5e004de
Signed-off-by: Pete Birley <pete@port.direct>
2019-01-28 10:56:35 -06:00
Pete Birley
bf3871b739 HTK: support a map for hostname lookups
This PS adds support for maps containing `host` for use within
the hostname lookup functions as well as a simple string.

Change-Id: I6fc5ebfb349c6581d40fe2d8723771d16ba1f9ec
Signed-off-by: Pete Birley <pete@port.direct>
2019-01-28 10:15:54 -06:00
Zuul
f0f1b57b3c Merge "[CEPH] Journal automation and disk cleanup updates" 2019-01-28 06:05:45 +00:00
Zuul
404930ae75 Merge "Helm: Update version to 2.12.3" 2019-01-26 23:45:13 +00:00
Zuul
901fe70298 Merge "Pentest - NC1.0 K8S –Security HTTP Headers Not Present – TCP 6443" 2019-01-26 07:03:51 +00:00
Zuul
d1b77b2bea Merge "Prometheus: Update pod container status alerts" 2019-01-25 19:39:47 +00:00
Zuul
7fe287bf11 Merge "Add liveness probe to fluentd" 2019-01-25 19:39:46 +00:00
Zuul
4132b40d4f Merge "[CEPH] Setup a cronjob to run OSD defrags for FileStore" 2019-01-24 23:51:43 +00:00
Matthew Heler
61b93c6b46 [CEPH] Journal automation and disk cleanup updates
Refactor the OSD Block initialization code that performs clean ups
to use all the commands that ceph-disk zap uses.

Extend the functionality when an OSD initializes to create journal
partitions automatically. For example if /dev/sdc3 is defined as a
journal disk, the chart will automatically create that partition.
The size of the journal partition is determined by the
osd_journal_size that is defined in ceph.conf.

Change the OSD_FORCE_ZAP option to OSD_FORCE_REPAIR to automatically
recreate/self-heal Filestore OSDs. This option will now call a
function to repair a journal disk, and recreate partitions. One
caveat is that the device partitions must be defined (e.g.
/dev/sdc1) for a journal. Otherwise, the OSD is zapped and re-created
if the whole disk (e.g. /dev/sdc) is defined as the journal disk.

Change-Id: Ied131b51605595dce65eb29c0b64cb6af979066e
2019-01-24 11:47:30 -06:00
Steve Wilkerson
ea0b94a052 Helm: Update version to 2.12.3
This updates Helm from version 2.12.1 to 2.12.3

Change-Id: Ie85e4ae3b55ce2e9f67a5e67af1e785540a6637e
2019-01-24 09:09:00 -06:00
Steve Wilkerson
9bb603ed0c Update Helm to version 2.12.1
This updates Helm from version 2.11.0 to 2.12.1

Change-Id: I9bc37c330b068388df9840eb84dfb12e2536c173
2019-01-23 16:43:36 -06:00
Jagan Kavva
c49207819e Pentest - NC1.0 K8S –Security HTTP Headers Not Present – TCP 6443
The server should send an X-Content-Type-Options: nosniff header to
make sure the browser does not try to detect a different Content-Type
than what is actually sent (which can lead to XSS).

Additionally, the server should send an X-Frame-Options: deny header to
protect against drag-and-drop clickjacking attacks in older browsers.

Change-Id: I779c519cf75bbee23d3a8348291c0fd053e61e4e
2019-01-23 16:21:32 -06:00
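The two response headers named in c49207819e can be shown with a small Python sketch. This is only an illustration of the header names and values from the commit message; how the chart actually configures the server on TCP 6443 is not shown here, and `with_security_headers` is a hypothetical helper.

```python
# Header names and values are taken from the commit message above.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "deny",
}


def with_security_headers(headers):
    """Return a copy of `headers` with the two hardening headers set.

    Hypothetical helper: existing values for these two headers are overwritten.
    """
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)
    return merged


if __name__ == "__main__":
    print(with_security_headers({"Content-Type": "application/json"}))
```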
Steve Wilkerson
9f5b1a77bc Add liveness probe to fluentd
This adds a liveness probe to the fluentd chart. This probe will
simply perform a tcpSocket check on the same port the readiness
probe executes the check on.

Change-Id: I768b23d36d50d6f6938f5588bea71e97aeb624b9
2019-01-23 11:47:34 -06:00
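A tcpSocket probe, as mentioned in 9f5b1a77bc, succeeds when a TCP connection to the port can be opened. The Python sketch below mimics that check for illustration only; it is not how the kubelet is implemented, and the function name, host, and timeout are assumptions.

```python
import socket


def tcp_socket_check(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Start a throwaway listener on a free local port, then probe it.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print(tcp_socket_check("127.0.0.1", port))
    listener.close()
```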
Steve Wilkerson
87ff958fb8 Prometheus: Update pod container status alerts
This updates the Prometheus pod container status alerts. This
ensures there are alerts defined for ImagePullBackOff,
ErrImagePull, and CreateContainerConfigError errors.

This also updates the Nagios service checks to include correct
checks for those alerts.

Change-Id: I91544e7dff8c6aac8c79cd8aa7d8f7bc03adaa9a
2019-01-23 16:26:39 +00:00
Steve Wilkerson
1e40765d88 OSH-Infra: Update multinode and aio-monitoring/logging jobs
This proposes moving the multinode job to a periodic job to
match the approach used in the openstack-helm repo.

This also adds the openstack-exporter to the aio monitoring job as
it was previously missing.

This also proposes moving the aio-logging and aio-monitoring jobs
to voting.

Change-Id: Idcd4544e03facdcd2430683b66bd80c79e73a372
2019-01-23 08:49:48 -06:00
Stamatis Katsaounis
032740957e Pin pip to 18.1 to allow build of docker images
Task: 29045
Story: 2004843

This patch pins pip to 18.1 as the latest pip 19.0 has a problem with
--no-cache-dir option. This problem is causing the build of docker
images of mariadb and kubeadm-aio to fail when they upgrade the
setuptools package.

Change-Id: If2b76249eeacec519a6a76605607ba6f3f81ac7d
Signed-off-by: Stamatis Katsaounis <mokats@intracom-telecom.com>
2019-01-23 11:56:10 +02:00
Zuul
067a37f76f Merge "Add exception handling to Kibana Selenium test" 2019-01-22 17:18:30 +00:00
Matthew Heler
d966085321 [CEPH] Setup a cronjob to run OSD defrags for FileStore
Create a cron and associated script to run monthly OSD defrags.
When the script runs it will switch the OSD disk to the CFQ I/O
scheduler to ensure that this is a non-blocking operation for ceph.
While this cron job will run monthly, it will only execute on OSDs
that are HDD based with Filestore.

Change-Id: I06a4679e0cbb3e065974d610606d232cde77e0b2
2019-01-22 04:27:41 +00:00
Meg Heisler
1bf24051c5 Add exception handling to Kibana Selenium test
This adds exception handling to the Kibana Selenium tests
to address the test failures due to TimeoutExceptions when
the dashboard loads slowly. Only TimeoutExceptions are handled,
so if there is an issue with the page itself, an error will still
cause the gate to fail as intended. When a TimeoutException
occurs, an error message is logged and a screenshot is taken
of the current page.

Change-Id: I16cd3a61ffce2e5fdc39bd7731cc068b8a6ec41f
2019-01-21 13:26:43 -06:00
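The handling described in 1bf24051c5 (catch only timeouts, log, take a screenshot, let everything else fail) can be sketched in Python as follows. The actual test code is not shown here: `run_step` and `take_screenshot` are hypothetical names, and the local `TimeoutException` class is a stand-in for `selenium.common.exceptions.TimeoutException` so that the sketch runs without Selenium installed.

```python
import logging


class TimeoutException(Exception):
    """Stand-in for selenium.common.exceptions.TimeoutException."""


def run_step(step, take_screenshot, logger=None):
    """Run one test step, tolerating only timeouts.

    Returns True on success and False on a timeout. Any other exception
    propagates, so a genuinely broken page still fails the run.
    """
    logger = logger or logging.getLogger("kibana-selenium")
    try:
        step()
        return True
    except TimeoutException:
        logger.error("Timed out waiting for the page to load")
        take_screenshot()
        return False


if __name__ == "__main__":
    def slow_page():
        raise TimeoutException("dashboard did not load in time")

    print(run_step(slow_page, lambda: print("screenshot taken")))
```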
Steve Wilkerson
cd4ec0b4b2 Grafana: Update Ceph dashboards for Mimic release
This updates the Ceph dashboards for Grafana, as some of the ceph
metrics have changed with the Mimic release.  This fixes issues
with the ceph OSD metrics that broke some Grafana panels, and also
removes the Ceph panel for displaying the number of monitors in
quorum, as that metric has been removed in Mimic.

Change-Id: If6cbbfa7d2972ddd0e44b29a6c8277188d2d9ff0
2019-01-21 09:25:57 -06:00
Zuul
c09c10443a Merge "[Calico] Update TLS settings for Calico" 2019-01-19 21:57:05 +00:00
Zuul
537912e976 Merge "HTK: Dont display keystone user password in ks-user job" 2019-01-19 05:52:22 +00:00
Zuul
2c7f0cbb49 Merge "[CEPH] Fix a race condition with udev on OSD start" 2019-01-18 23:37:59 +00:00
Matthew Heler
b0da8d78d1 [CEPH] Fix a race condition with udev on OSD start
Under some conditions udev may not trigger correctly and create
the proper uuid symlinks required by Ceph. In order to work around
this we manually create the symlinks.

Change-Id: Icadce2c005864906bcfdae4d28117628c724cc1c
2019-01-18 15:03:27 -06:00
Pete Birley
f7b7f17c12 HTK: Dont display keystone user password in ks-user job
This PS updates the ks user job script to not display the password
on stdout.

Change-Id: I3c11601a409d6d5993c351170c7057217cfabd8a
Signed-off-by: Pete Birley <pete@port.direct>
2019-01-18 20:44:20 +00:00
Dmitrii Kabanov
0c5e2c4830 [Calico] Update TLS settings for Calico
This PS makes it possible to use TLS with etcd (for Calico).
The ansible scripts were updated as well.

Change-Id: I522a78043a125660153aaa60f13d61ba8e325e75
2019-01-18 19:53:46 +00:00
Steve Wilkerson
b3097f6a25 Selenium: Add "|| true" to kibana selenium execution
This temporarily adds a "|| true" suffix to the kibana
selenium script execution, as we've noticed rare cases where the
tests fail due to the paths not being ready in time. Until we have
a path forward for waiting to ensure the path is ready,
we should allow for periodic failures of the kibana selenium tests.

Change-Id: I6c406ad8907cc87425562dee56eec6b8a0502142
2019-01-18 11:22:29 -06:00