This organizes the single node gates for osh-infra by function.
This organization aims to improve the single node gates in the
following ways:
1. Reduce number of services deployed in single node jobs
2. Only deploy Ceph for logging job, as Elasticsearch requires
RGW for snapshot repositories.
3. Use NFS for storage for monitoring job, as Ceph is not a
requirement for any of the services here.
4. Remove duplicate services deployed to multiple single node jobs
5. Remove storage from openstack-support job, as the only service
requiring storage is rabbitmq. Rabbitmq is deployed with
storage enabled in the openstack-helm checks/gates.
This also removes the documentation for the single node deployments,
as those deployments do not make sense with this change. This should
be revisited as a follow-on once we have a clear path forward for
the larger gate refactoring work.
Change-Id: I46951f76904fa2ab245a202d55f76019b7503362
Without this patch, there is a dependency between the two
repositories OSH and OSH-infra, which was recently introduced, and
which will cause a circular dependency problem when trying to remove
the duplicated jobs that will appear in OSH.
Change-Id: Ief4461a66f7139ae0650e4a240a3e65800821f78
Required-By: https://review.openstack.org/610481/
Co-Authored-By: Jean-Philippe Evrard <jean-philippe@evrard.me>
This removes the fluentbit sidecars from the ceph-mon and ceph-osd
charts. Instead, we mount /var/log/ceph as a hostpath and use the
fluentbit daemonset to target the mounted log files.
This also updates the fluentd configuration to use the correct
value type for flush_interval (time vs int), and updates the
fluentd Elasticsearch output values to help address the gate
failures resulting from the Elasticsearch bulk endpoints failing.
Change-Id: If3f2ff6371f267ed72379de25ff463079ba4cddc
This updates the mgr liveness script to use the admin socket
instead of resolving the ceph mon FQDN.
Change-Id: Id95f78afef44103a834312d0667d49947ee803a4
Co-Authored-By: Jean-Charles Lopez <jl970p@att.com>
This patch set changes the keystone in the k8s-keystone-auth to
be backed by LDAP. It also updates the test to use the LDAP users
instead of users created in the database.
Co-Authored-By: Samuel Pilla <sp516w@att.com>
Change-Id: Ia34dac51b36a300068ad5fd936c48b0f30821a52
Signed-off-by: Tin Lam <tin@irrational.io>
This PS documents the use of the anti-affinity function and fixes it
to properly support hard anti-affinity.
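As a rough illustration of the difference between soft and hard
anti-affinity, the sketch below builds the two Kubernetes
podAntiAffinity stanzas as plain dicts. It is not the helm-toolkit
template itself: the function name, the default weight, and the label
used in the usage line are assumptions made for the example.

```python
def pod_anti_affinity(labels, topology_key="kubernetes.io/hostname",
                      hard=False, weight=10):
    """Build a podAntiAffinity stanza as a dict (hypothetical helper)."""
    # The affinity term is the same for both modes: match pods carrying
    # the given labels within the given topology domain.
    term = {
        "labelSelector": {
            "matchExpressions": [
                {"key": key, "operator": "In", "values": [value]}
                for key, value in sorted(labels.items())
            ]
        },
        "topologyKey": topology_key,
    }
    if hard:
        # Hard anti-affinity: the scheduler must not co-locate the pods.
        return {"requiredDuringSchedulingIgnoredDuringExecution": [term]}
    # Soft anti-affinity: a weighted preference the scheduler may ignore.
    return {
        "preferredDuringSchedulingIgnoredDuringExecution": [
            {"weight": weight, "podAffinityTerm": term}
        ]
    }


# Example usage with an illustrative label.
hard_rule = pod_anti_affinity({"application": "mariadb"}, hard=True)
```

Note that the hard form takes the term directly, while the soft form
wraps it in a weighted podAffinityTerm entry.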
Change-Id: I2ec643d7720036b34fc249a2e230b3bed3aac41f
Signed-off-by: Pete Birley <pete@port.direct>
This PS moves to using the hostname, not the pod name, for the
instance-specific config sections.
Change-Id: If2bc60c9f4f12038e8aa70fbd33a009cdf652b75
Signed-off-by: Pete Birley <pete@port.direct>
This patch set renames the existing apparmor annotation
function to a more generic MAC (Mandatory Access Control)
name to be flexible enough to handle other MAC annotations
in the future.
Change-Id: I98a34484cebc2b420ad8f2664e4aaa84cfb9dca1
This updates the Grafana Ceph dashboards to use templating to
determine which ceph-mgr to use for displaying ceph related
metrics. This required setting the appropriate labels on the
ceph-mgr service to be able to distinguish between releases.
Change-Id: Id2eceacadc5b6366d7bc6668bc16ccf5ba878e4a
We see sporadic shutdown hangs that look to be the issue described at
https://jira.mariadb.org/browse/MDEV-15554
Upgrade minor version to address this.
Change-Id: Idf8403b44e871b5a32173bd153a8367519b239ec
This PS restores the kubeadm-aio image to a functioning state by
updating the requests package.
Change-Id: I706a8ca5661a8e773386c8d82c049e2a9a04e94e
Signed-off-by: Pete Birley <pete@port.direct>
This updates the Nagios image to include an update to the
Elasticsearch plugin that adds the appropriate headers to the
request sent to Elasticsearch. As Elasticsearch >=6.0 no longer
tries to determine the request data type, we need to explicitly
tell Elasticsearch the request body is JSON. Since we use
Elasticsearch 5.6.4 as default, this change will make the
deprecation warnings for the 6.0 breaking change go away.
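A minimal sketch of the idea, not the plugin's actual code: the request
body is serialized as JSON and the Content-Type header is set
explicitly. The function name, URL, and query below are hypothetical,
and the request is only built, not sent.

```python
import json
import urllib.request


def build_search_request(url, query):
    """Build a POST request that declares its body as JSON."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Elasticsearch >= 6.0 no longer guesses the body type, so the
        # header has to be sent explicitly.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example usage with a placeholder endpoint and query.
request = build_search_request(
    "http://elasticsearch.example:9200/_search",
    {"query": {"match_all": {}}},
)
```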
Change-Id: I0dbd8859ca8d0bd0893832b4edd92742e575598b
This patch set implements the helm toolkit function to generate a
kubernetes network policy manifest based on overrideable values.
This also adds a chart that shuts down all ingress and egress
traffic in the namespace. This can be used to ensure the
whitelisted network policy works as intended.
Additionally, network policies are implemented for some
infrastructure charts.
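For reference, a deny-all policy of the kind described above can be
sketched as follows. This is an illustration in Python rather than the
chart's template; the function name, policy name, and namespace are
assumptions.

```python
def deny_all_network_policy(namespace, name="deny-all"):
    """Return a NetworkPolicy manifest (as a dict) that blocks all traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Listing both policy types without any ingress or egress
            # rules leaves no traffic allowed in either direction.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


# Example usage with an illustrative namespace.
manifest = deny_all_network_policy("osh-infra")
```

With such a policy in place, only traffic explicitly allowed by other
policies in the namespace is expected to pass.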
Change-Id: I78e87ef3276e948ae4dd2eb462b4b8012251c8c8
Co-Authored-By: Mike Pham <tp6510@att.com>
Signed-off-by: Tin Lam <tin@irrational.io>
Without this patch, there is a dependency between the two
repositories OSH and OSH-infra, which will cause a circular
dependency problem when trying to remove the duplicated jobs
that will appear in OSH.
Change-Id: Ibeee0a853d0c1358519b0391c879137d8a214be2
This PS cleans up the scripts for the k8s keystone auth gate.
Change-Id: I248439f9b8ffa372dfaba5acba0c8c587231d901
Signed-off-by: Pete Birley <pete@port.direct>
This moves the definitions of openstack-helm-infra into
a newly created zuul.d folder.
The advantage is that it simplifies the readability of gating and
makes it easier for contributors to step into the gating
of the openstack-helm-* projects.
- zuul.d/playbooks will contain all the playbooks used for gating
- zuul.d/nodesets.yaml contains all the specific nodesets
required by OpenStack-Helm* projects
- zuul.d/project.yaml will be defined in each repo, and will
contain the repo's pipelines information (so this repository's
project.yaml only contains openstack-helm-infra pipelines)
- zuul.d/jobs.yaml will contain all the openstack-helm-*
repositories jobs
This patch also introduces a first common 'lint' playbook
and 'openstack-helm-lint' job, showing how a job can be
re-used across repositories without requiring repetition of
job definition/plays in other repositories.
Change-Id: Id055ddac4da4971b1fb13ac075a7659369cd2b24
The kube_node_status_ready and up metrics are obsolete for checking the
Kubernetes node condition. When a kubelet is down, the node itself is in
the NotReady state. With the 1.3.1 kube-state-metrics exporter, the
kube_node_status_condition metric provides the status value of the
kubelet (essentially the node).
https://github.com/kubernetes/kube-state-metrics/blob/master/Documentation/node-metrics.md
kube_node_status_condition includes condition=Ready and a status of true,
false, or unknown. When a kubelet is stopped, the status will be unknown,
since the kubelet itself will be unable to talk to the API. In other cases
it will be false. When the node is registered and available, it will be
set to true.
Replaced kube_node_status_ready with kube_node_status_condition, changed
the 1h to 1m, and increased the severity to "critical". Also modified the
K8SKubeletDown definitions to use 1m and critical severity.
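To illustrate how the status label is read, the sketch below picks out
nodes whose Ready condition is currently false or unknown. It is not
the alert rule from this change: the sample layout (a label dict plus a
value, where 1 marks the active status) and the node names are
assumptions for the example.

```python
def not_ready_nodes(samples):
    """Return {node: status} for nodes whose Ready condition is not true."""
    result = {}
    for labels, value in samples:
        # Only the Ready condition matters, and only the series whose
        # value is 1 describes the current status.
        if labels.get("condition") != "Ready" or value != 1:
            continue
        if labels.get("status") != "true":
            result[labels["node"]] = labels["status"]
    return result


# Example usage with made-up samples.
samples = [
    ({"node": "node-a", "condition": "Ready", "status": "true"}, 1),
    ({"node": "node-b", "condition": "Ready", "status": "unknown"}, 1),
    ({"node": "node-c", "condition": "Ready", "status": "false"}, 1),
]
print(not_ready_nodes(samples))
```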
Implements: Bug 1797133
Closes-Bug: #1797133
Change-Id: I025adb13c9d8642a218dfda1ff30f1577fa8c826
Signed-off-by: Kranthi Kiran Guttikonda <kranthi.guttikonda@b-yond.com>
This changes the image used for various jobs and helm tests in the
osh-infra charts. This replaces the kolla heat image with the loci
based heat image used for jobs and helm tests in openstack-helm in
order to drive consistency.
Change-Id: Ie9deedadb7507282fe62723ec4641dd508040364
This updates the helm tests for the fluent-logging chart to make
them more robust in being able to check for indexes defined in the
chart. This is done by calculating the combined flush interval
for both fluentbit and fluentd, and sleeping for at least one
flush cycle to ensure all functional indexes have received logged
events.
Then, the test determines what indexes should exist by checking
all Elasticsearch output configuration entries, determining
whether to use the default logstash-* index or the logstash_prefix
configuration value if it exists. For each of these indexes, the
test checks whether the index has successful hits (i.e., there
have been successful entries into the index).
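The two steps above can be sketched roughly as follows. This is not the
chart's test script: the shape of the output entries, the choice to add
the fluentbit interval to the largest fluentd interval, and the example
prefix are all assumptions.

```python
def combined_flush_interval(fluentbit_flush, fluentd_flush_intervals):
    """Seconds to wait for one full flush cycle (assumed: sum of both stages)."""
    return fluentbit_flush + max(fluentd_flush_intervals, default=0)


def expected_index_patterns(outputs):
    """List the index patterns the Elasticsearch outputs should create."""
    patterns = []
    for output in outputs:
        if output.get("type") != "elasticsearch":
            continue
        # Fall back to the default logstash prefix when none is set.
        prefix = output.get("logstash_prefix", "logstash")
        pattern = f"{prefix}-*"
        if pattern not in patterns:
            patterns.append(pattern)
    return patterns


# Example usage with a hypothetical output configuration.
outputs = [
    {"type": "elasticsearch"},
    {"type": "elasticsearch", "logstash_prefix": "kernel"},
]
wait_seconds = combined_flush_interval(5, [20])
patterns = expected_index_patterns(outputs)
```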
Change-Id: I36ed7b707491e92da6ac4b422936a1d65c92e0ac
This updates the logging interval values for the Elasticsearch
outputs from the previous string value (20s) to integers (20).
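As a small illustration of the string-to-integer change, the sketch
below normalizes either form to whole seconds. The helper name is
hypothetical and it only understands plain integers and an optional
trailing "s".

```python
import re


def flush_interval_seconds(value):
    """Convert 20 or "20s" to the integer 20 (hypothetical helper)."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+)s?", str(value).strip())
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    return int(match.group(1))


# Example usage: both spellings yield the same integer.
print(flush_interval_seconds("20s"), flush_interval_seconds(20))
```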
Change-Id: I681bdaf807ba0136fef3b6dc1c7ddaa689ae77a3