136 Commits

Author SHA1 Message Date
Zuul
7050ce0210 Merge "Fix redundant extra config files in grafana role" 2024-08-22 11:36:10 +00:00
Zuul
99ffff3551 Merge "Add support for docker_image_name_prefix" 2024-08-20 13:37:50 +00:00
Ivan Halomi
4ce47e2250 Refactor of kolla_container_facts
Refactor that prepares kolla_container_facts
module for introducing more actions that will be moved
from kolla_container module and kolla_container_volume_facts.

This change is based on a discussion about adding a new action
to kolla_container module that retrieves all names of the running
containers. It was agreed that kolla-ansible should follow Ansible's
direction of splitting modules between action modules and facts
modules. Because of this, kolla_container_facts needs to be able
to handle different requests for data about containers or volumes.

Change-Id: Ieaec8f64922e4e5a2199db2d6983518b124cb4aa
Signed-off-by: Ivan Halomi <ivan.halomi@tietoevry.com>
2024-08-12 09:54:05 +02:00
Michal Arbet
ae86e3a0db Add support for docker_image_name_prefix
The Kolla project supports building images with
user-defined prefixes. However, Kolla-ansible is unable
to use those images for installation.

This patch fixes that issue.

Closes-Bug: #2073541
Change-Id: Ia8140b289aa76fcd584e0e72686e3786215c5a99
2024-07-19 08:10:45 +02:00
Roman Krček
fb3a8f5fa9 Performance: use filters for service dicts
Most roles are not leveraging the jinja filters available.
According to [1] filtering the list of services makes the execution
faster than skipping the tasks.

This patchset also includes some cosmetic changes to genconfig.
Individual services are now also using a jinja filter. This has
no impact on performance, just makes the tasks look cleaner.

Naming of some vars in genconfig was changed to "service" to make
the tasks more uniform as some were previously using
the service name and some were using "service".

Three metrics were taken from the deployment:
- overall deployment time [s]
- time spent on the specific role [s]
- CPU usage (measured with perf) [-]

Results, on average:
- overall genconfig time went down from 209s to 195s
- time spent on the loadbalancer role went down from 27s to 23s
- time spent on the neutron role went down from 102s to 95s
- time spent on the nova-cell role went down from 54s to 52s
- average CPUs utilized, as reported by perf, went down from 3.31 to 3.15
For details of how this was measured see the comments in gerrit.

[1] - https://github.com/stackhpc/ansible-scaling/blob/master/doc/skip.md

Change-Id: Ib0f00aadb6c7022de6e8b455ac4b9b8cd6be5b1b
Signed-off-by: Roman Krček <roman.krcek@tietoevry.com>
2024-06-28 09:04:43 +02:00
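The difference between skipping tasks and filtering the service list first (commit fb3a8f5fa9) can be sketched in Python. This is a minimal illustration, not the filter code used by Kolla Ansible: the `SERVICES` dict, its `enabled` and `group` keys, and all function names are hypothetical.

```python
# Hypothetical service definitions; the real role defaults carry more keys.
SERVICES = {
    "grafana": {"enabled": True, "group": "grafana"},
    "prometheus-server": {"enabled": True, "group": "prometheus"},
    "opensearch": {"enabled": False, "group": "opensearch"},
}


def select_services(services, host_groups):
    """Return only the services that are enabled and mapped to this host."""
    return {
        name: svc
        for name, svc in services.items()
        if svc["enabled"] and svc["group"] in host_groups
    }


def run_with_skips(services, host_groups):
    """Old pattern: visit every service, then skip the ones that do not apply."""
    visited, ran = 0, []
    for name, svc in services.items():
        visited += 1  # each skipped item still costs one loop iteration
        if svc["enabled"] and svc["group"] in host_groups:
            ran.append(name)
    return visited, ran


def run_with_filter(services, host_groups):
    """New pattern: filter first, so the loop only sees applicable services."""
    selected = select_services(services, host_groups)
    return len(selected), list(selected)
```

Both variants run the same services; the filtered variant simply never iterates over the entries that would have been skipped, which is where the saving reported in the commit message would come from.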
Jan Horstmann
4178f02e2b Fix redundant extra config files in grafana role
Task `Check if extra configuration file exists` picks up all files in
`{{ node_custom_config }}/grafana` including those that get handled
specially later on.
While `prometheus.yml` and `provisioning.yml` are best excluded from
extra config, because their treatment requires more than just copying,
`grafana_home_dashboard.json` may simply be treated as extra config,
which saves the execution of two additional tasks.

Closes-Bug: 2067999

Change-Id: I7bce1fe3d0a96816f1782107b202d6dac7d1291d
Signed-off-by: Jan Horstmann <horstmann@osism.tech>
2024-06-10 15:01:32 +02:00
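The exclusion rule described in commit 4178f02e2b can be sketched as a plain filter over file names. This is an illustrative sketch only: the function and constant names are hypothetical, and the role itself implements this in Ansible tasks rather than Python.

```python
# Files that need more than a plain copy, as named in the commit message.
SPECIALLY_HANDLED = {"prometheus.yml", "provisioning.yml"}


def extra_config_files(found_files):
    """Keep every discovered file except the ones that are handled specially."""
    return sorted(name for name in found_files if name not in SPECIALLY_HANDLED)
```

Under this rule `grafana_home_dashboard.json` stays in the result and is copied like any other extra config file.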
Dawud
7102c9cc9c Configure log level field for the Grafana OpenSearch datasource
Change-Id: Ic38469661323fbce438c70bd1b9301e9f7db8030
2024-03-18 21:10:59 +00:00
Dawud
9afc9da226 Update Grafana OpenSearch datasource configuration
Updates the Grafana OpenSearch datasource configuration to use values
for OpenSearch that work out of the box.

Closes-Bug: #2039500
Change-Id: Id9c7e59e6ae1dd98176c68b14a2aff1985306751
2024-03-15 16:04:45 +00:00
Zuul
d30fb56c2a Merge "Remove the grafana volume" 2024-02-20 17:25:50 +00:00
Dawud
8962b4081e Remove the grafana volume
Fixes not being able to add additional plugins at build time due to the
`grafana` volume being mounted over the existing `/var/lib/grafana`
directory. This is fixed by copying the dashboards into the container
from an existing bind mount instead of using the ``grafana`` volume.
This however leaves behind the volume which should be removed with
`docker volume rm grafana` or by setting `grafana_remove_old_volume` to
`True`.

Closes-Bug: #2039498
Change-Id: Ibcffa5d8922c470f655f447558d4a9c73b1ba361
2024-02-12 16:03:19 +00:00
Zuul
db79eb0a55 Merge "Rename kolla_docker to kolla_container" 2023-11-28 12:06:09 +00:00
Will Szumski
dfce510c0f Fix grafana prometheus datasource
See:
https://grafana.com/docs/grafana/latest/administration/provisioning/

Closes-Bug: #2043828
Change-Id: I9ed07dc8c995adddf6d89838cd515af93d10bd00
2023-11-17 18:10:04 +00:00
Martin Hiner
a13d83400f Rename kolla_docker to kolla_container
Changes name of ansible module kolla_docker to
kolla_container.

Change-Id: I13c676ed0378aa721a21a1300f6054658ad12bc7
Signed-off-by: Martin Hiner <m.hiner@partner.samsung.com>
2023-11-15 13:54:57 +01:00
Will Szumski
37c2ab2aaa Support exposing prometheus_server externally
This avoids the need to use a proxy, or some other means, to connect to
Prometheus. This is disabled by default and can be enabled by setting
enable_prometheus_server_external to true.

Change-Id: Ia0af044ff436c2a204b357750a16ff49fcdfec45
2023-11-07 14:52:06 +00:00
Michal Nasiadka
4bc410c6ca haproxy: support single external frontend
Use case: exposing single external https frontend and
load balancing services using FQDNs.

Support different ports for internal and external endpoints.

Introduced kolla_url filter to normalize urls like:
- https://magnum.external:443/v1
- http://magnum.external:80/v1

Change-Id: I9fb03fe1cebce5c7198d523e015280c69f139cd0
Co-Authored-By: Jakub Darmach <jakub@stackhpc.com>
2023-06-29 01:44:00 +02:00
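The kind of URL normalization mentioned in commit 4bc410c6ca can be sketched with the Python standard library. This is not the `kolla_url` filter itself: `normalize_url` is a hypothetical helper, and it assumes that "normalize" means dropping a port that is already the default for the scheme. Userinfo in the URL, if any, is not preserved by this sketch.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports per scheme; a port equal to the default is redundant in a URL.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Drop the port from a URL when it is the default for its scheme."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore the brackets around IPv6 literals
    port = parts.port
    if port is not None and DEFAULT_PORTS.get(parts.scheme) != port:
        netloc = f"{host}:{port}"  # keep non-default ports
    else:
        netloc = host
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

With this helper the two URLs from the commit message lose their port, while a URL on a non-default port such as 8443 is returned unchanged.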
Mark Goddard
46aeb9843f Fix prechecks in check mode
When running in check mode, some prechecks previously failed because
they use the command module which is silently not run in check mode.
Other prechecks were not running correctly in check mode due to e.g.
looking for a string in empty command output or not querying which
containers are running.

This change fixes these issues.

Closes-Bug: #2002657
Change-Id: I5219cb42c48d5444943a2d48106dc338aa08fa7c
2023-01-12 14:27:36 +00:00
Matt Crees
6c2aace8d6 Integrate oslo-config-validator
Regularly, we experience issues in Kolla Ansible deployments because we
use wrong options in OpenStack configuration files. This is because
OpenStack services ignore unknown options. We also need to keep on top
of deprecated options that may be removed in the future. Integrating
oslo-config-validator into Kolla Ansible will greatly help.

Adds a shared role to run oslo-config-validator on each service. Takes
into account that services have multiple containers, and these may also
use multiple config files. Service roles are extended to use this shared
role. Executed with the new command ``kolla-ansible validate-config``.

Change-Id: Ic10b410fc115646d96d2ce39d9618e7c46cb3fbc
2022-12-21 17:19:09 +00:00
Michal Nasiadka
e1ec02eddf Replace ElasticSearch and Kibana with OpenSearch
This change replaces ElasticSearch with OpenSearch, and Kibana
with OpenSearch Dashboards. It migrates the data from ElasticSearch
to OpenSearch upon upgrade.

No TLS support is in this patch (will be a followup).

A replacement for ElasticSearch Curator will be added as a followup.

Depends-On: https://review.opendev.org/c/openstack/kolla/+/830373

Co-authored-by: Doug Szumski <doug@stackhpc.com>
Co-authored-by: Kyle Dean <kyle@stackhpc.com>
Change-Id: Iab10ce7ea5d5f21a40b1f99b28e3290b7e9ce895
2022-12-01 10:27:50 +00:00
Doug Szumski
adb8f89a36 Remove support for deploying OpenStack Monasca
Kolla Ansible is switching to OpenSearch and is dropping support for
deploying ElasticSearch. This is because the final OSS release of
ElasticSearch has exceeded its end of life.

Monasca is affected because it uses both Logstash and ElasticSearch.
Whilst it may continue to work with OpenSearch, Logstash remains an
issue.

In the absence of any renewed interest in the project, we remove
support for deploying it. This helps to reduce the complexity
of log processing configuration in Kolla Ansible, freeing up
development time.

Change-Id: I6fc7842bcda18e417a3fd21c11e28979a470f1cf
2022-11-11 15:48:11 +00:00
Ivan Halomi
4ca2d41762 Adding container_engine to kolla_toolbox module
Second part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which it was suggested to split the patch into smaller ones.

This change adds container_engine to module parameters
so when we introduce podman, kolla_toolbox can be used
for both engines.

Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: Ic2093aa9341a0cb36df8f340cf290d62437504ad
2022-11-04 15:32:30 +01:00
Ivan Halomi
7a9f04573a Adding container engine to kolla_container_facts
Second part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which it was suggested to split the patch into smaller ones.

This change adds the container_engine variable to the
kolla_container_facts module. This prepares the module to be used with
both docker and podman without further changes in roles.

Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I9e8fa30646844ab4a288555f3aafdda345b3a118
2022-11-02 13:44:45 +01:00
Michal Arbet
4838591c6c Add loadbalancer-config role and wrap haproxy-config role inside
This patch adds loadbalancer-config role
which is "wrapper" around haproxy-config
and proxysql-config role which will be added
in follow-up patches.

Change-Id: I64d41507317081e1860a94b9481a85c8d400797d
2022-08-09 12:15:49 +02:00
Michal Arbet
baad47ac61 Edit services roles to support database sharding
Depends-On: https://review.opendev.org/c/openstack/kolla/+/769385
Depends-On: https://review.opendev.org/c/openstack/kolla/+/765781

Change-Id: I3c4182a6556dafd2c936eaab109a068674058fca
2022-08-09 12:15:26 +02:00
Michal Nasiadka
dcf5a8b65f Fix var-spacing
ansible-lint introduced var-spacing - let's fix our code.

Change-Id: I0d8aaf3c522a5a6a5495032f6dbed8a2be0251f0
2022-07-25 22:15:15 +02:00
Pierre Riteau
13b0f3b861 Make external access to monitoring services configurable
Change-Id: Iaf6bf36ae0adce3342981c36c859fc138b172f6b
2022-06-27 11:57:53 +02:00
T0125936 - LALLAU Bertrand
13af278708 Fix typo in endpoint influxdb_internal_endpoint variable
This patch simply fixes a typo in the 'influxdb_internal_endpoint' variable.

Change-Id: I1b1068e84be7f7eaff1a4eab1ba9ddcd6f4241c7
2022-06-08 11:31:38 +02:00
Radosław Piliszek
3e75a33ad4 Use the new image naming scheme
Change-Id: Ib4b15ed4feac82d8492b1c0f0238a752eac668e6
2022-05-23 06:37:25 +00:00
Pierre Riteau
555cd39f1a Fix typos in docs
This is a follow up to I7e5c1e20c7b66b64cbd333f669ef8d8da60daaa8.

Change-Id: I11a86f59c1fb9cddde3370b544ee7bf4e8ae4fb4
2022-05-02 15:44:34 +02:00
Zuul
2c15d36fed Merge "Adds prometheus_scrape_interval" 2022-04-21 16:55:35 +00:00
Marcin Juszkiewicz
1620ab5be9 drop install_type from image names
We have only one value for install_type now and it gets removed from
image names.

Change-Id: I8bf95fd7aa9dd26b80d618ca0fcb097003b4cb0a
2022-04-20 12:29:12 +02:00
Jan Horstmann
3d91e69aab Change grafana provisioning.yaml indentation
This commit changes the indentation scheme used in
`ansible/roles/grafana/templates/provisioning.yaml.j2` to the commonly
used pattern of two whitespaces.

Change-Id: I2f9d34930ed06aa2e63f7cc28bfdda7046fc3e67
2022-03-25 09:26:24 +01:00
Pierre Riteau
f37562827d Remove grafana [session] configuration
These configuration settings were removed in Grafana 6.2. Instead we can
use [remote_cache], but it is not required since it will use database
settings by default.

Change-Id: I37966027aea9039b2ecba4214444507e9d87f513
2022-02-22 10:26:37 +01:00
Will Szumski
033db44f1c Adds prometheus_scrape_interval
Grafana requires the scrape interval to be set to be able to compute
$__rate_interval. The default is 15s which does not match the kolla
default of 60s. The symptom of not setting this is that you will see
"no data" when zooming graphs that use rate queries. This occurs as the
interval will be set to a period shorter than the scrape interval.
The recommendation is that you use a common scrape interval for all
jobs. See:

- https://grafana.com/blog/2020/09/28/new-in-grafana-7.2-__rate_interval-for-prometheus-rate-queries-that-just-work/
- https://stackoverflow.com/questions/66369969/set-scrape-interval-in-provisioned-prometheus-data-source-in-grafana

Change-Id: I7e5c1e20c7b66b64cbd333f669ef8d8da60daaa8
2022-02-14 11:10:44 +00:00
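The "no data" symptom described in commit 033db44f1c can be illustrated with a short calculation. The sketch assumes the formula given in the Grafana documentation, `$__rate_interval = max($__interval + scrape interval, 4 * scrape interval)`, and the usual rule that a Prometheus `rate()` needs at least two samples in its window. The function names are hypothetical.

```python
def rate_interval(interval_s, configured_scrape_s):
    """Window Grafana would use for rate(), per the documented formula (assumed)."""
    return max(interval_s + configured_scrape_s, 4 * configured_scrape_s)


def guaranteed_samples(window_s, actual_scrape_s):
    """Minimum number of samples that any window of this length contains."""
    return window_s // actual_scrape_s


# Grafana assumes 15s, but Prometheus actually scrapes every 60s:
zoomed_window = rate_interval(15, 15)           # 60s window
too_few = guaranteed_samples(zoomed_window, 60)  # only 1 sample guaranteed

# Grafana is told the real 60s scrape interval:
fixed_window = rate_interval(15, 60)            # 240s window
enough = guaranteed_samples(fixed_window, 60)    # 4 samples guaranteed
```

With the mismatched setting, a zoomed-in graph can end up with a window that holds a single sample, so rate queries return nothing; with the matching setting the window always spans several scrapes.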
Pierre Riteau
56fc74f231 Move project_name and kolla_role_name to role vars
Role vars have a higher precedence than role defaults. This allows
importing default vars from another role via vars_files without overriding
project_name (see related bug for details).

Change-Id: I3d919736e53d6f3e1a70d1267cf42c8d2c0ad221
Related-Bug: #1951785
2021-12-31 09:26:25 +00:00
Dr. Jens Harbott
f8f34e0c47 Bump timeout for grafana startup
The initial migrations when starting grafana for the first time may
sometimes take much longer than 20s; we have seen samples of up to
nearly 60s. Allow 120s to have some margin. Also make the timeout
parameters configurable.

Closes-Bug: 1769962
Signed-off-by: Dr. Jens Harbott <harbott@osism.tech>
Change-Id: If9186d8aa65150c492657550064789e211dbb570
2021-12-09 08:05:57 +01:00
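A wait with configurable retry count and delay, in the spirit of commit f8f34e0c47, can be sketched as follows. The parameter names and the example values are hypothetical and are not the variables introduced by the commit; the only point is that the total budget is roughly `retries * delay`.

```python
import time


def wait_until(check, retries, delay, sleep=time.sleep):
    """Call check() up to `retries` times, pausing `delay` seconds between tries.

    Returns the attempt number on which check() succeeded.
    """
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        if attempt < retries:
            sleep(delay)  # no pause after the final failed attempt
    raise TimeoutError(f"not ready after {retries} attempts")
```

For example, `wait_until(grafana_is_up, retries=60, delay=2)` would allow roughly 120s, where `grafana_is_up` stands for any caller-supplied readiness check.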
Uwe Grawert
82b0e095a5 Grafana: Run privileged when copying home dashboard file
The copy job for the grafana home dashboard file needs to run
privileged, otherwise a permission denied error occurs.

Closes-Bug: #1947710

Change-Id: Ib15e961e5193af55e45a443305a96667295f3cb7
2021-10-20 11:26:09 +02:00
Radosław Piliszek
9ff2ecb031 Refactor and optimise image pulling
We get a nice optimisation by using a filtered loop instead
of task skipping per service with 'when'.

Partially-Implements: blueprint performance-improvements
Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
2021-08-10 11:57:54 +00:00
Zuul
6ea8390a12 Merge "Extend support for custom Grafana dashboards" 2021-07-12 16:00:47 +00:00
Mark Goddard
ade5bfa302 Use ansible_facts to reference facts
By default, Ansible injects a variable for every fact, prefixed with
ansible_. This can result in a large number of variables for each host,
which at scale can incur a performance penalty. Ansible provides a
configuration option [0] that can be set to False to prevent this
injection of facts. In this case, facts should be referenced via
ansible_facts.<fact>.

This change updates all references to Ansible facts within Kolla Ansible
from using individual fact variables to using the items in the
ansible_facts dictionary. This allows users to disable fact variable
injection in their Ansible configuration, which may provide some
performance improvement.

This change disables fact variable injection in the ansible
configuration used in CI, to catch any attempts to use the injected
variables.

[0] https://docs.ansible.com/ansible/latest/reference_appendices/config.html#inject-facts-as-vars

Change-Id: I7e9d5c9b8b9164d4aee3abb4e37c8f28d98ff5d1
Partially-Implements: blueprint performance-improvements
2021-06-23 10:38:06 +01:00
Doug Szumski
82cf40edf2 Remove Monasca Grafana service
In the Xena cycle it was decided to remove the Monasca
Grafana fork due to lack of maintenance. This commit removes
the service and provides a limited workaround using the
Monasca Grafana datasource with vanilla Grafana.

Depends-On: I9db7ec2df050fa20317d84f6cea40d1f5fd42e60
Change-Id: I4917ece1951084f6665722ba9a91d47764d3709a
2021-04-27 11:06:25 +00:00
Doug Szumski
d01192c160 Extend support for custom Grafana dashboards
The current behaviour is to support supplying a single
folder of Grafana dashboards which can then be populated
into a single folder in Grafana. Some users may wish
to have sub-folders of Dashboards, and load these into
separate dashboard folders in Grafana via a custom
provisioning file. For example, a user may have a
sub-folder of Ceph dashboards that they wish to keep
separate from OpenStack dashboards. This patch supports
sub-folders whilst not affecting the original mechanism.

Trivial-Fix

Change-Id: I9cd289a1ea79f00cee4d2ef30cbb508ac73f9767
2021-04-19 11:11:43 +01:00
Zuul
2ba4c88c8d Merge "Add support for custom grafana dashboards" 2021-03-17 16:48:48 +00:00
Bartosz Bezak
a9e30382fe Add support for custom grafana dashboards
Allow users to import custom grafana dashboards.
Dashboards as JSON files should be placed into
"{{ node_custom_config }}/grafana/dashboards/" folder.

Change-Id: Id0f83b8d08541b3b74649f097b10c9450201b426
2021-03-16 17:10:19 +01:00
Will Szumski
31f97d6cca Do not wait for grafana to start when kolla_action=config
Prior to this change it was not possible to generate the config
before deploying the services as you'd hit:

RUNNING HANDLER [Waiting for grafana to start on first node] *************************
Monday 18 January 2021  15:06:35 +0000 (0:00:00.182)       0:04:39.213 ********
skipping: [sv-h22a8-u19]
skipping: [sv-h22a5-u36]
FAILED - RETRYING: Waiting for grafana to start on first node (10 retries left).

This would never succeed as the service has not yet been deployed.

TrivialFix
Change-Id: I9437a049b24e5e613a7e66add481a8983b84867a
2021-01-18 15:42:31 +00:00
Mark Goddard
db4fc85c33 Revert "Performance: Use import_tasks in the main plays"
This reverts commit 9cae59be51e8d2d798830042a5fd448a4aa5e7dc.

Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues.

Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc
Closes-Bug: #1906288
2020-12-14 10:36:55 +00:00
Radosław Piliszek
9cae59be51 Performance: Use import_tasks in the main plays
Main plays are action-redirect-stubs, ideal for import_tasks.

This avoids 'include' penalty and makes logs/ara look nicer.

Fixes haproxy and rabbitmq not to check the host group as well.

Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0
Partially-Implements: blueprint performance-improvements
2020-10-27 19:09:32 +01:00
Radosław Piliszek
3411b9e420 Performance: optimize genconfig
Config plays do not need to check containers. This avoids skipping
tasks during the genconfig action.

Ironic and Glance rolling upgrades are handled specially.

Swift and Bifrost do not use the handlers at all.

Partially-Implements: blueprint performance-improvements
Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a
2020-10-12 19:30:06 +02:00
Mark Goddard
b685ac44e0 Performance: replace unconditional include_tasks with import_tasks
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. For unconditionally included tasks, switching to
import_tasks provides a clear benefit.

Benchmarking of include vs. import is available at [1].

This change switches from include_tasks to import_tasks where there is
no condition applied to the include.

[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md#task-include-and-import

Partially-Implements: blueprint performance-improvements

Change-Id: Ia45af4a198e422773d9f009c7f7b2e32ce9e3b97
2020-08-28 16:12:03 +00:00
Rafael Weingärtner
f425c0678f Standardize use and construction of endpoint URLs
The goal of this pull request is to normalize the construction and use
of internal, external, and admin URLs. While extending Kolla-ansible
to enable a more flexible method to manage external URLs, we noticed
that the same URL was constructed multiple times in different parts
of the code. This can make it difficult for people who want to work
with these URLs and can create inconsistencies in a large code base
over time. Therefore, we are proposing here the use of a
"single Kolla-ansible variable" per endpoint URL, which makes things
easier for people who are interested in overriding/extending these URLs.

As an example, we extended Kolla-ansible to facilitate the "override"
of public (external) URLs with the following standard
"<component/serviceName>.<companyBaseUrl>".
Therefore, the "NAT/redirect" in the SSL termination system (HAproxy,
HTTPD or some other) is done via the service name, and not by the port.
This allows operators to easily and automatically create more friendly
URL names. To develop this feature, we first applied the patch that
we are now sending to the community. We did that to reduce the surface
of changes in Kolla-ansible.

Another example is the integration of Kolla-ansible and Consul, which
we also implemented internally, and which also requires URL changes.
Therefore, this PR is essential to reduce code duplication and to
make it easier for users/developers to work with and customize the
service URLs.

Change-Id: I73d483e01476e779a5155b2e18dd5ea25f514e93
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
2020-08-19 07:22:17 +00:00
Mark Goddard
146b00efa7 Mount /etc/timezone based on host OS
Previously we mounted /etc/timezone if the kolla_base_distro is debian
or ubuntu. This would fail prechecks if debian or ubuntu images were
deployed on CentOS. While this is not a supported combination, for
correctness we should fix the condition to reference the host OS rather
than the container OS, since that is where the /etc/timezone file is
located.

Change-Id: Ifc252ae793e6974356fcdca810b373f362d24ba5
Closes-Bug: #1882553
2020-08-10 10:14:18 +01:00
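The change of condition in commit 146b00efa7 can be sketched as two predicates. This is an illustration under assumptions: the function names are hypothetical, and the new check assumes the host is identified by an Ansible-style OS family value in which both Debian and Ubuntu hosts report "Debian".

```python
def mounts_timezone_old(kolla_base_distro):
    """Old condition: decided by the distro of the container images."""
    return kolla_base_distro in ("debian", "ubuntu")


def mounts_timezone_new(host_os_family):
    """New condition: decided by the host OS, where /etc/timezone lives."""
    return host_os_family == "Debian"
```

For Ubuntu images deployed on a CentOS host, the old predicate asks for a mount of a file the host does not have, while the new predicate correctly returns False.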