This uses the same approach as the mariadb role (and others).
Closes-Bug: #1928193
Co-Authored-By: John Garbutt <johng@stackhpc.com>
Change-Id: I79a7a8c80327cfd9ef31d17fe71f450a181a638c
Add the "enable_prometheus_etcd_integration" configuration parameter,
which can be used to configure Prometheus to scrape etcd metrics
endpoints. By default, "enable_prometheus_etcd_integration" is enabled
only when both "enable_prometheus" and "enable_etcd" are enabled.
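A minimal sketch of such a default, written as a logical AND of the two
flags. The exact expression is an assumption for illustration and is not
copied from this change:

```yaml
# Illustrative default (assumed form): the integration is on only when
# both Prometheus and etcd are enabled.
enable_prometheus_etcd_integration: "{{ enable_prometheus | bool and enable_etcd | bool }}"
```

Operators who want to scrape etcd regardless could presumably set the
variable explicitly in their own configuration.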
Change-Id: I7a0b802c5687e2d508e06baf55e355d9761e806f
Without this configuration, all mount points report the same
utilisation metrics [1]. With the rslave option, all root mounts from
the host are visible in the container, so we can remove the bind mounts
for /proc and /sys.
[1] https://github.com/prometheus/node_exporter#docker
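As a rough illustration of the mount described above, the volume list
for the node exporter container might look like the following. The
variable name is hypothetical; the mount string follows the form shown
in the node_exporter Docker documentation [1]:

```yaml
# Hypothetical variable name, for illustration only.
example_node_exporter_volumes:
  # Mount the host root read-only with rslave propagation so that host
  # mount points are visible inside the container.
  - "/:/host:ro,rslave"
  # Separate bind mounts for /proc and /sys are no longer listed.
```

The exporter is then pointed at the mounted root (per [1], via
--path.rootfs=/host).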
Change-Id: I4087dc81f9d1fa5daa24b9df6daf1f9e1ccd702f
Closes-Bug: #1961438
Add support for deploying the Kolla Prometheus libvirt exporter image to
facilitate gathering metrics from the Nova libvirt service.
Co-Authored-by: Dr. Jens Harbott <harbott@osism.tech>
Change-Id: Ib27e60c39297b86ae674297370f9543ab08cda05
Partially-Implements: blueprint libvirt-exporter
Kolla has removed the Volume V2 API by default since OpenStack Wallaby.
However, openstack-exporter attempts to use the Volume V2 API by
default, resulting in clean installs failing to fetch Cinder metrics
in Prometheus.
This patch updates the clouds.yml configuration file for
openstack-exporter to use the Volume V3 API instead.
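A minimal sketch of the relevant setting, assuming the standard
openstacksdk clouds file layout. The cloud name is hypothetical and the
other keys are omitted:

```yaml
clouds:
  default:                 # hypothetical cloud name
    # Ask clients to use the Volume V3 API instead of V2.
    volume_api_version: 3
```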
Closes-Bug: #1938194
Change-Id: Ifbb601be3ef1a1e853d5a7e832adf556c0ae38b9
Role vars have a higher precedence than role defaults. This allows
importing default vars from another role via vars_files without
overriding project_name (see related bug for details).
Change-Id: I3d919736e53d6f3e1a70d1267cf42c8d2c0ad221
Related-Bug: #1951785
This reverts commit 4ff65b7661ea06e9fa8631c4eb82232e03af77d7.
Reason for revert: adds assumptions about inventory_hostname being resolvable.
Closes-Bug: #1955563
Change-Id: Ifa2b2ea8622f56c34b8f7f37fee53133272ff925
"BINLOG MONITOR" and "SLAVE MONITOR" replace
"REPLICATION CLIENT" (which is now an alias for "BINLOG MONITOR").
The validation in Ansible MySQL collection is too simple to
understand aliases and breaks. Hence, let's use the canonical
names and adapt per service according to its needs.
Change-Id: I1175e4846384accd19942620dc155d0c5728e64b
We get a nice optimisation by using a filtered loop instead
of task skipping per service with 'when'.
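The difference can be sketched as follows, using only built-in Ansible
filters. The variable example_services and the paths are hypothetical,
"enabled" is assumed to be a real boolean, and the change itself may
use a project-specific filter instead:

```yaml
# Before: iterate over every service and skip disabled ones per item.
- name: Copy config.json (per-item skipping)
  template:
    src: "{{ item.key }}.json.j2"
    dest: "/etc/example/{{ item.key }}/config.json"
  when: item.value.enabled | bool
  loop: "{{ example_services | dict2items }}"

# After: filter first, so disabled services never enter the loop and no
# per-item "skipping" results are produced.
- name: Copy config.json (filtered loop)
  template:
    src: "{{ item.key }}.json.j2"
    dest: "/etc/example/{{ item.key }}/config.json"
  loop: "{{ example_services | dict2items | selectattr('value.enabled') | list }}"
```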
Partially-Implements: blueprint performance-improvements
Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
This patch adds support for integrating Prometheus with Fluentd.
This can be used to extract useful information about the status
of Fluentd, such as output buffer capacity and logging rate,
and also to extract metrics from logs via custom Fluentd
configuration. More information can be found in [1].
[1] https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus
Change-Id: I233d6dd744848ef1f1589a462dbf272ed0f3aaae
This allows checking of TLS servers. It can be useful to check
RabbitMQ TLS, including certificate expiry.
Change-Id: I2192d3481d790c11b110bf10082b3efeade75463
Adds support for passing extra runtime options to cAdvisor.
By default, the new options stop cAdvisor from exporting rarely useful
metrics and labels. This helps reduce the load on Prometheus and
cAdvisor itself.
Change-Id: I81f3845d6cd03a70a0c8569f8d0ea421027df083
By default, Ansible injects a variable for every fact, prefixed with
ansible_. This can result in a large number of variables for each host,
which at scale can incur a performance penalty. Ansible provides a
configuration option [0] that can be set to False to prevent this
injection of facts. In this case, facts should be referenced via
ansible_facts.<fact>.
This change updates all references to Ansible facts within Kolla Ansible
from using individual fact variables to using the items in the
ansible_facts dictionary. This allows users to disable fact variable
injection in their Ansible configuration, which may provide some
performance improvement.
This change disables fact variable injection in the ansible
configuration used in CI, to catch any attempts to use the injected
variables.
[0] https://docs.ansible.com/ansible/latest/reference_appendices/config.html#inject-facts-as-vars
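For illustration, a task that keeps working when fact injection is
disabled reads the fact from the ansible_facts dictionary. The task
itself is a made-up example:

```yaml
- name: Show the host name without relying on injected fact variables
  debug:
    # Works with fact injection disabled; "ansible_hostname" would not.
    msg: "{{ ansible_facts.hostname }}"
```

Note that keys in ansible_facts drop the ansible_ prefix.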
Change-Id: I7e9d5c9b8b9164d4aee3abb4e37c8f28d98ff5d1
Partially-Implements: blueprint performance-improvements
This reverts commit c6259158e3eff4aff9770b7044b0179a7de533aa.
Reason for revert: cAdvisor fails with:
invalid value "percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process" for flag -disable_metrics: unsupported metric "referenced_memory" specified in disable_metrics
Change-Id: I1a0eea5c20f95f38c707401b56b7d2454484377d
Adds support for passing extra runtime options to cAdvisor.
By default, the new options stop cAdvisor from exporting rarely useful
metrics and labels. This helps reduce the load on Prometheus and
cAdvisor itself.
Change-Id: Id0144e8fa518e3236cb94ba2e3961fb455d36443
The rabbitmq_prometheus plugin is available in RabbitMQ 3.8.
https://www.rabbitmq.com/prometheus.html
Implements: blueprint rabbitmq-prometheus
Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Change-Id: I4d69a93a6c70db8d40626042cdbe773747b238ae
Deprecates support for Prometheus v1.x.
Support for it will be removed from Kolla Ansible in Xena.
Change-Id: I027b19621196c698e09f79af294ba1b5dbfc0516
It is now possible to deploy either the 1.x or the 2.x version of
Prometheus. The new 2.x version introduces breaking changes in terms of
storage format and command line options.
Change-Id: I80cc6f1947f3740ef04b29839bfa655b14fae146
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
This reverts commit 9cae59be51e8d2d798830042a5fd448a4aa5e7dc.
Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues.
Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc
Closes-Bug: #1906288
Add a scrape_timeout option to the prometheus_openstack_exporter job
to avoid timeouts in large OpenStack environments.
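A sketch of what the resulting job could look like in the Prometheus
configuration. The job name, durations and target address are
illustrative assumptions, not the values used by this change:

```yaml
scrape_configs:
  - job_name: openstack_exporter     # assumed job name
    scrape_interval: 60s             # illustrative value
    # Prometheus requires scrape_timeout <= scrape_interval.
    scrape_timeout: 45s              # illustrative value
    static_configs:
      - targets:
          - "192.0.2.10:9198"        # hypothetical exporter address
```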
Change-Id: If96034e602bee3b3eea34a2656047355e1d17eec
Closes-Bug: #1903547
Main plays are action-redirect-stubs, ideal for import_tasks.
This avoids 'include' penalty and makes logs/ara look nicer.
Also fixes haproxy and rabbitmq so that they do not check the host group.
Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0
Partially-Implements: blueprint performance-improvements
Config plays do not need to check containers. This avoids skipping
tasks during the genconfig action.
Ironic and Glance rolling upgrades are handled specially.
Swift and Bifrost do not use the handlers at all.
Partially-Implements: blueprint performance-improvements
Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a
The Prometheus OpenStack exporter was needlessly configured to use the
prometheus Docker volume and change permissions of /data, which does
not exist in the container image.
This must have been copy-pasted from existing Prometheus code.
Change-Id: I96017c17e68ca7a00a2d5ac41f2f43ef87694514
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. For unconditionally included tasks, switching to
import_tasks provides a clear benefit.
Benchmarking of include vs. import is available at [1].
This change switches from include_tasks to import_tasks where there is
no condition applied to the include.
[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md#task-include-and-import
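The distinction can be sketched like this. The file names and the
condition variable are hypothetical:

```yaml
# Unconditional: a static import avoids the include overhead.
- import_tasks: config.yml

# Conditional: a dynamic include is kept, so the whole file is skipped
# in one step instead of skipping each imported task individually.
- include_tasks: bootstrap.yml
  when: example_needs_bootstrap | bool
```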
Partially-Implements: blueprint performance-improvements
Change-Id: Ia45af4a198e422773d9f009c7f7b2e32ce9e3b97
The goal of this patch is to normalize the construction and use of
internal, external, and admin URLs. While extending Kolla-ansible to
enable a more flexible method of managing external URLs, we noticed
that the same URL was constructed multiple times in different parts
of the code. This makes it difficult for people who want to work with
these URLs and, over time, creates inconsistencies in a large code
base. Therefore, we propose using a single Kolla-ansible variable per
endpoint URL, which makes it easier for people to override or extend
these URLs.
As an example, we extended Kolla-ansible to facilitate overriding
public (external) URLs with the following standard:
"<component/serviceName>.<companyBaseUrl>".
With it, the "NAT/redirect" in the SSL termination system (HAProxy,
HTTPD or some other) is done via the service name, and not by the port.
This allows operators to easily and automatically create friendlier
URL names. To develop this feature, we first applied the patch that
we are now sending to the community. We did that to reduce the surface
of changes in Kolla-ansible.
Another example is the integration of Kolla-ansible and Consul, which
we also implemented internally, and which also requires URL changes.
Therefore, this patch is essential to reduce code duplication and to
make it easier for users and developers to work with and customize
the service URLs.
Change-Id: I73d483e01476e779a5155b2e18dd5ea25f514e93
Signed-off-by: Rafael Weingärtner <rafael@apache.org>