915 Commits

Author SHA1 Message Date
Steve Wilkerson
538d51e991 Organize aio gates by function
This organizes the single node gates for osh-infra by function.
This organization aims to improve the single node gates in the
following ways:

1. Reduce number of services deployed in single node jobs
2. Only deploy Ceph for logging job, as Elasticsearch requires
   RGW for snapshot repositories.
3. Use NFS for storage for monitoring job, as Ceph is not a
   requirement for any of the services here.
4. Remove duplicate services deployed to multiple single node jobs
5. Remove storage from openstack-support job, as the only service
   requiring storage is rabbitmq. Rabbitmq is deployed with
   storage enabled in the openstack-helm checks/gates.

This also removes the documentation for the single node deployments,
as those deployments do not make sense with this change. This should
be revisited as a follow-on once we have a clear path forward for
the larger gate refactoring work.

Change-Id: I46951f76904fa2ab245a202d55f76019b7503362
2018-10-19 12:28:18 -05:00
Chris Wedgwood
d9457c8860 Remove dependency to OSH repository of new jobs
Without this patch, there is a dependency between the two
repositories OSH and OSH-infra, which was recently introduced, and
which will cause a circular dependency problem when trying to remove
the duplicated jobs that will appear in OSH.

Change-Id: Ief4461a66f7139ae0650e4a240a3e65800821f78
Required-By: https://review.openstack.org/610481/
Co-Authored-By: Jean-Philippe Evrard <jean-philippe@evrard.me>
2018-10-18 21:06:21 +00:00
Zuul
27ea2a53a6 Merge "Fix grep logic around weighting OSDs during ceph-client chart." 2018-10-18 09:04:29 +00:00
Zuul
cd4b8e9b87 Merge "Ceph: Remove fluentbit sidecars, mount hostpath for logs" 2018-10-17 21:41:38 +00:00
Matthew Heler
0de1d23895 Fix grep logic around weighting OSDs during ceph-client chart.
Change-Id: I7831ac07a53b9aaf3000e9f64bf8c17344723a8f
2018-10-17 15:58:24 -05:00
Steve Wilkerson
92717bdc72 Ceph: Remove fluentbit sidecars, mount hostpath for logs
This removes the fluentbit sidecars from the ceph-mon and ceph-osd
charts. Instead, we mount /var/log/ceph as a hostpath and use the
fluentbit daemonset to target the mounted log files.

This also updates the fluentd configuration to use the correct
value type for flush_interval (time vs int), and updates the
fluentd elasticsearch output values to help address the gate
failures caused by the Elasticsearch bulk endpoints failing.

Change-Id: If3f2ff6371f267ed72379de25ff463079ba4cddc
2018-10-17 11:05:03 -05:00
Chinasubbareddy M
793b3631b5 Ceph-mgr: make liveness to check through admin socket
This updates the mgr liveness script to use the admin socket
instead of resolving the ceph mon FQDN.

Change-Id: Id95f78afef44103a834312d0667d49947ee803a4
Co-Authored-By: Jean-Charles Lopez <jl970p@att.com>
2018-10-17 14:40:42 +00:00
Zuul
b3b4e6858b Merge "Add LDAP support for k8s-keystone-auth in gate" 2018-10-17 08:39:14 +00:00
Zuul
1b7240c64c Merge "Secure pool during deployment" 2018-10-17 07:37:38 +00:00
Samuel Pilla
6fe001361a Add LDAP support for k8s-keystone-auth in gate
This patch set changes the keystone used by k8s-keystone-auth to
be backed by LDAP. It also updates the test to use the LDAP users
instead of users created in the database.

Co-Authored-By: Samuel Pilla <sp516w@att.com>
Change-Id: Ia34dac51b36a300068ad5fd936c48b0f30821a52
Signed-off-by: Tin Lam <tin@irrational.io>
2018-10-17 06:19:20 +00:00
Jean-Charles Lopez
55f1d2db57 Secure pool during deployment
Change-Id: Ifbeb956ab2c015deaed501ee4bff22dfc1e0404f
2018-10-17 04:53:53 +00:00
Pete Birley
be7b01d798 Helm-Toolkit: Document and fix the anti-affinity function
This PS documents the use of the anti-affinity function and fixes
it to properly support hard anti-affinity.

Change-Id: I2ec643d7720036b34fc249a2e230b3bed3aac41f
Signed-off-by: Pete Birley <pete@port.direct>
2018-10-17 04:50:02 +00:00
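To illustrate what "hard" anti-affinity means in the commit above, here is a minimal sketch in Python rather than Helm templating. The function name pod_anti_affinity, its parameters and the default weight are hypothetical and are not the helm-toolkit interface; only the Kubernetes field names are real.

```python
def pod_anti_affinity(labels, hard=False,
                      topology_key="kubernetes.io/hostname", weight=10):
    """Build a podAntiAffinity block for pods carrying `labels`.

    Hypothetical helper for illustration; not the helm-toolkit function.
    """
    term = {
        "labelSelector": {
            "matchExpressions": [
                {"key": key, "operator": "In", "values": [value]}
                for key, value in sorted(labels.items())
            ]
        },
        "topologyKey": topology_key,
    }
    if hard:
        # Hard: the scheduler must not co-locate matching pods.
        return {"podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [term]}}
    # Soft: the scheduler only prefers to spread matching pods.
    return {"podAntiAffinity": {
        "preferredDuringSchedulingIgnoredDuringExecution": [
            {"weight": weight, "podAffinityTerm": term}]}}
```

Note that the two forms differ in shape as well as in name: the hard form lists the terms directly, while the soft form wraps each term in a weighted entry.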
Zuul
7d3bda1307 Merge "Ceph-RGW: Use hostname not podname for pod specific config" 2018-10-17 04:24:49 +00:00
Zuul
51bab02b24 Merge "Rename mandatory access control annotation func" 2018-10-17 04:24:40 +00:00
Zuul
21f46d294b Merge "[Open vSwitch] Remove auto_bridge_add support" 2018-10-17 04:23:52 +00:00
Zuul
23fba51fbb Merge "[MariaDB] Bump to version 10.2.18 to avoid shutdown hangs" 2018-10-17 04:23:51 +00:00
Zuul
570355b1d9 Merge "Initialize OSDs with a crush weight of 0 to prevent automatic rebalancing." 2018-10-17 02:45:45 +00:00
Pete Birley
a01e2db6ab Ceph-RGW: Use hostname not podname for pod specific config
This PS moves to using the hostname, not the pod name, for the
instance-specific config sections.

Change-Id: If2bc60c9f4f12038e8aa70fbd33a009cdf652b75
Signed-off-by: Pete Birley <pete@port.direct>
2018-10-17 01:38:34 +00:00
Cliff Parsons
c5b10d155f Rename mandatory access control annotation func
This patch set renames the existing apparmor annotation
function to a more generic MAC (Mandatory Access Control)
name to be flexible enough to handle other MAC annotations
in the future.

Change-Id: I98a34484cebc2b420ad8f2664e4aaa84cfb9dca1
2018-10-17 01:35:49 +00:00
Matthew Heler
5efac315f7 Initialize OSDs with a crush weight of 0 to prevent automatic rebalancing.
Weight the OSDs based on reported disk size when ceph-client chart runs.

Change-Id: I9f4080a9843f1a63564cf71154841b351382bfe2
2018-10-16 21:33:49 +00:00
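A minimal sketch of weighting an OSD from its disk size, as the commit above describes. It assumes the common Ceph convention that a CRUSH weight is roughly the device capacity in TiB; the commit does not state the formula the chart uses, and the helper names below are hypothetical.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def crush_weight(size_bytes, precision=4):
    """Return a CRUSH weight for a disk, assuming weight == size in TiB."""
    if size_bytes < 0:
        raise ValueError("disk size cannot be negative")
    return round(size_bytes / TIB, precision)


def reweight_command(osd_id, size_bytes):
    """Format the CLI call that would apply the computed weight."""
    return "ceph osd crush reweight osd.{} {}".format(
        osd_id, crush_weight(size_bytes))
```

An OSD created with weight 0 receives no data until a command like the one returned by reweight_command is run, which is what prevents rebalancing from starting automatically.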
Steve Wilkerson
f3d8bda9d6 Grafana: Support multiple Ceph clusters with dashboards
This updates the Grafana Ceph dashboards to use templating to
determine which ceph-mgr to use for displaying ceph related
metrics.  This required setting the appropriate labels on the
ceph-mgr service to be able to distinguish between releases

Change-Id: Id2eceacadc5b6366d7bc6668bc16ccf5ba878e4a
2018-10-16 21:32:13 +00:00
Chris Wedgwood
8dad346f3f [MariaDB] Bump to version 10.2.18 to avoid shutdown hangs
We see sporadic shutdown hangs that look to be the issue described at
https://jira.mariadb.org/browse/MDEV-15554

Upgrade minor version to address this.

Change-Id: Idf8403b44e871b5a32173bd153a8367519b239ec
2018-10-16 21:30:22 +00:00
Pete Birley
a4111037b0 Gate: Fix kubeadm-aio image
This PS restores the kubeadm-aio image to a functioning state by
updating the requests package.

Change-Id: I706a8ca5661a8e773386c8d82c049e2a9a04e94e
Signed-off-by: Pete Birley <pete@port.direct>
2018-10-16 16:09:49 -05:00
Zuul
6e092c908c Merge "Externalize some repo URL vars to allow runtime modification" 2018-10-16 00:04:06 +00:00
Zuul
580522c42a Merge "Ceph-client: make pool creation dependent on ceph-mgr service" 2018-10-15 22:05:40 +00:00
Zuul
1f9c8d7f42 Merge "Nagios: Update image with Elasticsearch plugin headers" 2018-10-15 17:58:17 +00:00
Zuul
b3e777c596 Merge "Add network policy toolkit function" 2018-10-15 17:45:35 +00:00
Roman Gorshunov
da31cacafd Externalize some repo URL vars to allow runtime modification
This makes it possible to use a local mirror of certain packages.

Change-Id: Ia06c6df0628ce5a44ed072c875eaa65d1343c65d
2018-10-15 17:10:10 +00:00
Chinasubbareddy M
616aecd80a Ceph-client: make pool creation dependent on ceph-mgr service
This adds a dependency so that pool creation waits until ceph-mgr is fully up.

Change-Id: Id3111810a855bedff62970091b225358c269cecd
2018-10-15 10:00:27 -05:00
Steve Wilkerson
19248c11e9 Nagios: Update image with Elasticsearch plugin headers
This updates the Nagios image to include an update to the
Elasticsearch plugin that adds the appropriate headers to the
request sent to Elasticsearch. As Elasticsearch >=6.0 no longer
tries to determine the request data type, we need to explicitly
tell Elasticsearch the request body is JSON. Since we use
Elasticsearch 5.6.4 as default, this change will make the
deprecation warnings for the 6.0 breaking change go away.

Change-Id: I0dbd8859ca8d0bd0893832b4edd92742e575598b
2018-10-15 14:20:22 +00:00
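A minimal sketch of the kind of request the commit above describes: a JSON body sent with an explicit Content-Type header. It uses only the Python standard library and builds the request without sending it. The function name and URL layout are illustrative assumptions, not the Nagios plugin's code.

```python
import json
import urllib.request


def build_search_request(base_url, index, query):
    """Build (but do not send) an Elasticsearch search request."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url="{}/{}/_search".format(base_url, index),
        data=body,
        # Declare the body type explicitly instead of relying on
        # Elasticsearch to detect it.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would perform the call against a reachable cluster.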
Tin Lam
92e68d33ea Add network policy toolkit function
This patch set implements the helm toolkit function to generate a
kubernetes network policy manifest based on overrideable values.
This also adds a chart that shuts down all ingress and egress
traffic in the namespace. This can be used to ensure the
whitelisted network policy works as intended.

Additionally, implementation is done for some infrastructure charts.

Change-Id: I78e87ef3276e948ae4dd2eb462b4b8012251c8c8
Co-Authored-By: Mike Pham <tp6510@att.com>
Signed-off-by: Tin Lam <tin@irrational.io>
2018-10-15 13:50:50 +00:00
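A minimal sketch of the two kinds of manifest the commit above refers to: a policy that shuts down all traffic in a namespace, and a whitelisting policy built from overrideable values. It is written in Python rather than Helm templating; the function names and the shape of the `values` dictionary are hypothetical, while the NetworkPolicy fields are standard Kubernetes.

```python
def deny_all_policy(namespace, name="deny-all"):
    """NetworkPolicy selecting every pod and allowing no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches all pods
            "policyTypes": ["Ingress", "Egress"],  # no rules: deny both
        },
    }


def whitelist_policy(namespace, name, pod_labels, values):
    """NetworkPolicy for `pod_labels` with rules taken from `values`.

    `values` is an assumed structure: {"ingress": [...], "egress": [...]}.
    """
    spec = {"podSelector": {"matchLabels": dict(pod_labels)},
            "policyTypes": []}
    for direction, policy_type in (("ingress", "Ingress"),
                                   ("egress", "Egress")):
        if direction in values:
            spec["policyTypes"].append(policy_type)
            spec[direction] = list(values[direction])
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }
```

With the deny-all policy in place, a service stays reachable only if its whitelist policy actually permits the traffic, which is how the deny-all chart can validate the whitelists.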
Jean-Philippe Evrard
0dcceacf7d Remove dependency to OSH repository for test jobs
Without this patch, there is a dependency between the two
repositories OSH and OSH-infra, which will cause a circular
dependency problem when trying to remove the duplicated jobs
that will appear in OSH.

Change-Id: Ibeee0a853d0c1358519b0391c879137d8a214be2
2018-10-15 13:34:08 +02:00
kranthi guttikonda
549bf29fd8 cronjob-checkPGs failure fix
Added role and rolebindings to fix permissions.
Added volume definitions for ceph-bin, ceph-etc and
ceph-client-adminkeyring, as well as the serviceaccount
and node selectors.

Implements: Bug 1797589
Closes-Bug: #1797589
Change-Id: Ib0e77e088c6aa82e441aba72bebc4b258deb88c4
Signed-off-by: Kranthi Kiran Guttikonda <kranthi.guttikonda@b-yond.com>
2018-10-13 18:45:10 -04:00
Zuul
be7dbf6c28 Merge "[MariaDB] Update/remove deprecated configuration" 2018-10-13 21:40:48 +00:00
Zuul
75ea67e591 Merge "Fluent-logging: Update helm tests for checking index entries" 2018-10-13 03:11:39 +00:00
Zuul
c39b29e351 Merge "Fluentd: Update logging interval values" 2018-10-13 03:02:04 +00:00
Zuul
016cc39c9f Merge "Gate: Cleanup scripts for k8s keystone auth gate" 2018-10-12 23:38:52 +00:00
Zuul
bfb1c2a498 Merge "Replace docker-py with docker" 2018-10-12 22:15:15 +00:00
Zuul
cd50e50eb3 Merge "Charts: Update heat image used for jobs and helm tests" 2018-10-12 20:35:14 +00:00
Pete Birley
8bb71f6659 Gate: Cleanup scripts for k8s keystone auth gate
This PS cleans up the scripts for the k8s keystone auth gate.

Change-Id: I248439f9b8ffa372dfaba5acba0c8c587231d901
Signed-off-by: Pete Birley <pete@port.direct>
2018-10-12 13:43:41 -05:00
Jean-Philippe Evrard
100c900da0 Regroup OpenStack-Helm* gating under a folder
This moves the definitions of openstack-helm-infra into
a newly created zuul.d folder.

The advantage is that it simplifies the readability of gating and
makes it easier for contributors to step into the gating
of the openstack-helm-* projects.

- zuul.d/playbooks will contain all the playbooks used for gating
- zuul.d/nodesets.yaml contains all the specific nodesets
  required by OpenStack-Helm* projects
- zuul.d/project.yaml will be defined in each repo, and will
  contain the repo's pipelines information (so this repository's
  project.yaml only contains openstack-helm-infra pipelines)
- zuul.d/jobs.yaml will contain all the openstack-helm-*
  repositories' jobs

This patch also introduces a first common 'lint' playbook
and 'openstack-helm-lint' job, showing how a job can be
re-used across repositories without requiring repetition of
job definition/plays in other repositories.

Change-Id: Id055ddac4da4971b1fb13ac075a7659369cd2b24
2018-10-12 15:13:12 +02:00
kranthi guttikonda
f995680e2a Prometheus kubelet.rules change
The kube_node_status_ready and up metrics are obsolete for checking the
Kubernetes node condition. When a kubelet is down, the node itself is in
the NotReady state. With the 1.3.1 kube-state-metrics exporter, the
kube_node_status_condition metric provides the status value of the kubelet
(essentially the node).
https://github.com/kubernetes/kube-state-metrics/blob/master/Documentation/node-metrics.md

kube_node_status_condition includes condition=Ready and a status of true,
false or unknown. When a kubelet is stopped, the status will be unknown,
since the kubelet itself will be unable to talk to the API. In other cases
it will be false. When the node is registered and available, it will be set
to true.

Replaced kube_node_status_ready with kube_node_status_condition,
changed the 1h to 1m and increased the severity to "critical". Also
modified the K8SKubeletDown definitions to use 1m and critical severity.

Implements: Bug 1797133
Closes-Bug: #1797133
Change-Id: I025adb13c9d8642a218dfda1ff30f1577fa8c826
Signed-off-by: Kranthi Kiran Guttikonda <kranthi.guttikonda@b-yond.com>
2018-10-11 16:31:16 -04:00
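A minimal sketch of how the true/false/unknown statuses described in the commit above can be interpreted. It assumes kube_node_status_condition is exported as one series per status, with value 1 on the status that currently holds. The sample data structure and function names are hypothetical, and this is not the alert expression from the chart.

```python
def ready_status(samples, node):
    """Return "true", "false" or "unknown" for the node's Ready condition.

    `samples` is an assumed list of {"labels": {...}, "value": number}
    entries for kube_node_status_condition. Returns None if no series
    with value 1 is found for the node.
    """
    for sample in samples:
        labels = sample["labels"]
        if (labels.get("node") == node
                and labels.get("condition") == "Ready"
                and sample["value"] == 1):
            return labels["status"]
    return None


def node_not_ready(samples, node):
    """True when the node reports Ready as false or unknown."""
    return ready_status(samples, node) in ("false", "unknown")
```

A stopped kubelet shows up here as "unknown" rather than "false", so a check that only looks for "false" would miss it.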
Steve Wilkerson
c7cbb9f4dd Charts: Update heat image used for jobs and helm tests
This changes the image used for various jobs and helm tests in the
osh-infra charts. This replaces the kolla heat image with the loci
based heat image used for jobs and helm tests in openstack-helm in
order to drive consistency

Change-Id: Ie9deedadb7507282fe62723ec4641dd508040364
2018-10-11 14:47:58 -05:00
Steve Wilkerson
78283495f0 Fluent-logging: Update helm tests for checking index entries
This updates the helm tests for the fluent-logging chart to make
them more robust in being able to check for indexes defined in the
chart.  This is done by calculating the combined flush interval
for both fluentbit and fluentd, and sleeping for at least one
flush cycle to ensure all functional indexes have received logged
events.

Then, the test determines what indexes should exist by checking
all Elasticsearch output configuration entries, determining
whether to use the default logstash-* index or the logstash_prefix
configuration value if it exists.  For each of these indexes, the
test checks whether the indexes have successful hits (i.e., there
have been successful entries into these indexes).

Change-Id: I36ed7b707491e92da6ac4b422936a1d65c92e0ac
2018-10-11 13:28:30 -05:00
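The test logic described in the commit above can be sketched as follows. This is an illustration in Python, not the chart's actual test script: the shape of the `outputs` list and the function names are assumptions, while logstash_prefix and its default of "logstash" come from the Elasticsearch output plugin.

```python
def combined_flush_seconds(fluentbit_flush, fluentd_flush):
    """Minimum wait so both flush stages have run at least once."""
    return fluentbit_flush + fluentd_flush


def expected_indexes(outputs):
    """Index patterns the Elasticsearch outputs should have written to.

    `outputs` is an assumed list of output config sections as dicts.
    """
    patterns = []
    for output in outputs:
        if output.get("type") != "elasticsearch":
            continue
        # Fall back to the default prefix when none is configured.
        prefix = output.get("logstash_prefix", "logstash")
        pattern = prefix + "-*"
        if pattern not in patterns:
            patterns.append(pattern)
    return patterns
```

A test built on this would sleep for combined_flush_seconds(...) and then query each returned pattern, failing if any of them has zero hits.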
Chris Wedgwood
8554bdcbef [MariaDB] Update/remove deprecated configuration
Change-Id: I18aa87602b63ecd051c21e007aff8cadccdd0cda
2018-10-11 15:31:31 +00:00
Steve Wilkerson
9b5d4d9f17 Fluentd: Update logging interval values
This updates the logging interval values for the Elasticsearch
outputs from the previous string value (20s) to integers (20).

Change-Id: I681bdaf807ba0136fef3b6dc1c7ddaa689ae77a3
2018-10-11 09:05:00 -05:00
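A minimal sketch of the distinction the commit above draws between a duration string such as 20s and a plain integer 20. The helper below is hypothetical and only illustrates normalizing both forms to seconds; it assumes the s/m/h/d suffixes and is not part of the chart.

```python
import re

_UNIT_SECONDS = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(value):
    """Normalize an interval given as int (20) or string ("20s")."""
    if isinstance(value, bool):
        raise TypeError("interval must be an int or a string")
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+)([smhd]?)", value.strip())
    if match is None:
        raise ValueError("unrecognized interval: {!r}".format(value))
    number, unit = match.groups()
    return int(number) * _UNIT_SECONDS[unit]
```

For example, to_seconds("20s") and to_seconds(20) both yield 20, so a consumer that expects an integer can accept either spelling.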
Zuul
e231a7c5fd Merge "Elasticsearch: Update log4j2 configuration settings" 2018-10-10 19:23:47 +00:00
Zuul
922d7d3d26 Merge "Charts: Update helm test pod templates" 2018-10-10 06:00:44 +00:00
Zuul
c84dfd8122 Merge "Add configMap hash to annotation" 2018-10-09 22:23:10 +00:00
Zuul
9f63330f0b Merge "Add missing labels to cronJobs" 2018-10-09 21:49:41 +00:00