1956 Commits

Author SHA1 Message Date
Michal Nasiadka
b04486df07 Bump ansible-core versions to 2.15 and 2.16
Change-Id: Iab40eb92c7e4a9092471bef9d4477a4fa34f1c85
2024-03-14 06:13:38 +00:00
Michal Arbet
8c760d38a0 Fix creation of ovs bridges
This patch fixes the creation of the openvswitch
bridge by fixing an ansible task that was rewritten
to use an ansible module, but unfortunately, its loop
was implemented incorrectly.

Closes-Bug: #2056332
Change-Id: Ia55a36c0f9b122b72d757ca973e7d8f76ae84344
2024-03-11 09:49:51 +01:00
Michal Arbet
59da07920b Fix coordination when redis used
Tooz 6.0.1 includes commit [1], which introduced
parsing the username from the Redis connection URL.
As a result, services started authenticating as admin
which, by the way, was incorrect even before, as either
a created user or the default one should have been used.

The reason it worked before is simply because the username
'admin' wasn't parsed anywhere.

This patch fixes the user being used and sets the correct
'default' one.

[1] https://review.opendev.org/c/openstack/tooz/+/907656

Closes-Bug: #2056667
Depends-On: https://review.opendev.org/c/openstack/kolla/+/911703
Change-Id: I5568dba15fa98e009ad4a9e41756aba0fa659371
2024-03-11 09:49:01 +01:00
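The Redis coordination fix above turns on the username embedded in the backend URL. As a minimal sketch (not tooz's actual parsing code), the standard library shows how a URL parser that starts honouring the user part changes the identity used. The host, port and password below are placeholders, and `redis_username` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlsplit


def redis_username(url: str) -> Optional[str]:
    """Return the username embedded in a connection URL, or None if absent."""
    return urlsplit(url).username


# Hypothetical backend URLs; only the user part differs.
old_style = "redis://admin:secret@192.0.2.10:6379"
new_style = "redis://default:secret@192.0.2.10:6379"

print(redis_username(old_style))  # user that was silently ignored before
print(redis_username(new_style))  # user the patch switches to
```

While the username was not parsed at all, either URL behaved the same; once it is parsed, only the second one names the user the commit message calls correct.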
Zuul
5169e3bcbe Merge "Fix typo in release note" 2024-03-07 13:52:12 +00:00
Zuul
a7dd2425ec Merge "prometheus: Add friendly instance labels for ironic and alertmanager" 2024-03-06 12:27:58 +00:00
Pierre Riteau
6ac502ec20 Fix typo in release note
Change-Id: I2f6cd19b7f4d3954bf9de17e6095d39545fe05d3
2024-03-01 09:30:12 +01:00
Michal Nasiadka
add8351834 Missing reno for Ic121bf9f90c9865cd4d08890c80247570ef310ae
Follow-up for missing release note, see [1].

[1]: https://review.opendev.org/q/Ic121bf9f90c9865cd4d08890c80247570ef310ae

Change-Id: Ia65e4e28d8a8dfdf439adbdd5a2516b6c064109a
2024-03-01 09:11:59 +01:00
Will Szumski
4d40c9e68f Adds feature flag for ironic-inspector in bifrost
This is useful for backwards compatibility.

Depends-On: https://review.opendev.org/c/openstack/kolla/+/909865
Change-Id: Ib2936580db5e7ab3479722bc353c39063010b5f2
2024-02-28 14:59:29 +00:00
Mark Goddard
10f0e9ddef prometheus: Add friendly instance labels for ironic and alertmanager
These were omitted from I387c9d8f5c01baf6054381834ecf4e554d0fff35 and
I387c9d8f5c01baf6054381834ecf4e554d0fff35.

Closes-Bug: #2041855
Change-Id: I25e5450d1caeebd9c900c190fc0079988f1ca574
2024-02-28 12:16:32 +00:00
Zuul
e513ddd982 Merge "Adjust Ceph metrics scrape interval in Prometheus" 2024-02-27 11:59:32 +00:00
Zuul
d30fb56c2a Merge "Remove the grafana volume" 2024-02-20 17:25:50 +00:00
Zuul
33129b7554 Merge "Add service role to ironic service users" 2024-02-19 12:40:03 +00:00
Zuul
a6fa564499 Merge "Ironic: enable elevated access for project scoped service role" 2024-02-19 12:40:00 +00:00
Zuul
0dac9eb93d Merge "Fix mariadb role when used with check mode" 2024-02-15 14:13:18 +00:00
Bartosz Bezak
600e912400 Add service role to ironic service users
Add the service role to ironic service users. Ironic recently enforced
new policy validation as part of the RBAC efforts. [1][2]
Service user support was also added to Ironic. [3]
The admin role needs to stay, as not all services have added service role support. [4][5]

[1] https://review.opendev.org/c/openstack/ironic/+/902009
[2] e2a47de10a/goals/selected/consistent-and-secure-rbac.rst (phase-2)
[3] https://review.opendev.org/c/openstack/ironic/+/907148
[4] https://review.opendev.org/q/topic:bp%252Fpolicy-service-role-default
[5] https://review.opendev.org/q/topic:%22New-Location-Apis%22

Related-Bug: #2051837
Change-Id: I048402c2247188cf57f35437f557f84ac25d4ff2
2024-02-15 14:05:52 +00:00
Bartosz Bezak
121aa3d258 Ironic: enable elevated access for project scoped service role
Ironic recently started to enforce new policies and scope [1].
Ironic is one of the few OpenStack projects that need
system scope for some admin-related API calls [2].
However, Ironic also started to allow project-scoped behaviour
for the service role via the
``rbac_service_role_elevated_access`` setting [3] [4]. This change enables
this setting so that the service role behaves similarly to other
OpenStack projects.

[1] https://review.opendev.org/c/openstack/ironic/+/902009
[2] e2a47de10a/goals/selected/consistent-and-secure-rbac.rst (L261)
[3] https://review.opendev.org/c/openstack/ironic/+/907148
[4] 8ec5606622/releasenotes/notes/service-project-service-role-fix-e4d1a8c23856926a.yaml

Related-Bug: #2051837

Change-Id: If8d7cf1663145d0398a2e936486e2b316d4df5e0
2024-02-15 15:04:06 +01:00
Dawud
8962b4081e Remove the grafana volume
Fixes not being able to add additional plugins at build time due to the
`grafana` volume being mounted over the existing `/var/lib/grafana`
directory. This is fixed by copying the dashboards into the container
from an existing bind mount instead of using the ``grafana`` volume.
This however leaves behind the volume which should be removed with
`docker volume rm grafana` or by setting `grafana_remove_old_volume` to
`True`.

Closes-Bug: #2039498
Change-Id: Ibcffa5d8922c470f655f447558d4a9c73b1ba361
2024-02-12 16:03:19 +00:00
Zuul
35352a6be0 Merge "Rework horizon role to support local_settings.d" 2024-02-08 20:45:20 +00:00
Michal Arbet
b5aa63dee1 Rework horizon role to support local_settings.d
This patch implements Horizon's preferred way of configuring
itself, as described in the docs [1].

[1] https://docs.openstack.org/horizon/latest/configuration/settings.html

Depends-On: https://review.opendev.org/c/openstack/kolla/+/906339
Change-Id: I60ab4634bf4333c47d00b12fc4ec00570062bd18
2024-02-07 16:13:26 +01:00
Zuul
074d8b0ebf Merge "Enable HAProxy Prometheus metrics endpoint" 2024-02-07 10:33:24 +00:00
Zuul
53f2c582d9 Merge "Update keystone service user passwords" 2024-02-07 10:07:30 +00:00
Michal Arbet
d0b93a631d Fix mariadb role when used with check mode
This patch adds check_mode: false to the tasks
in restart_services.yml that only check the
WSREP status and whether the port is up.

Closes-Bug: #2052501
Change-Id: I92a591900d85138a87991a18dd4339efd053ef1b
2024-02-06 10:39:34 +01:00
de6878a819 reno: Update master for unmaintained/yoga
Update the yoga release notes configuration to build from
unmaintained/yoga.

Change-Id: I3ebb137938de8f9333c89173974656712e89c17f
2024-02-05 16:06:51 +00:00
Zuul
50ad7c6681 Merge "Configure missing nova services to expose vendordata over configdrive" 2024-02-02 11:42:14 +00:00
Grzegorz Koper
0376f9dd8d Configure missing nova services to expose vendordata over configdrive
Closes-Bug: #2049607

Change-Id: I14ae2be2e19ad06e3190e2e948bac7ce77e80d4b
2024-01-30 14:47:14 +01:00
Michal Arbet
6f847610b5 Fix neutron DNS integration
This patch enables or disables DNS integration based on
the neutron_dns_integration variable.

A precheck is also added which verifies that dns_domain
in neutron.conf has a non-default value when DNS integration is
enabled, as this is a requirement.

[1] https://docs.openstack.org/neutron/latest/admin/config-dns-int.html
[2] https://docs.openstack.org/neutron/latest/admin/config-dns-int-ext-serv.html#config-dns-int-ext-serv

Closes-Bug: #2049503

Change-Id: I90f0f8dcec6fa0112179f050d96e9d9db5956cf8
2024-01-30 09:56:45 +01:00
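The precheck described in the commit above can be sketched as a small predicate. This is an illustration only, not the Ansible task from the patch. `dns_precheck_passes` is a hypothetical name, and the default value `openstacklocal` is taken from the Neutron documentation rather than from this commit.

```python
# Assumption: Neutron's documented default for dns_domain.
NEUTRON_DEFAULT_DNS_DOMAIN = "openstacklocal"


def dns_precheck_passes(dns_integration_enabled: bool, dns_domain: str) -> bool:
    """Pass unless DNS integration is on and dns_domain is still the default."""
    if not dns_integration_enabled:
        # Nothing to validate when the feature is switched off.
        return True
    return dns_domain != NEUTRON_DEFAULT_DNS_DOMAIN


print(dns_precheck_passes(True, "openstacklocal"))
print(dns_precheck_passes(True, "example.org."))
```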
Michal Arbet
66c4f72c50 Enable instance usage audit only when ceilometer is enabled
This patch disables periodic compute.instance.exists
notifications when designate is enabled.

Related-Bug: #2049503
Change-Id: I39fe2db9182de23c1df814d911eec15e86317702
2024-01-30 09:48:35 +01:00
Alex-Welsh
ffd6e3bf32 Update keystone service user passwords
Service user passwords will now be updated in keystone if services are
reconfigured with new passwords set in config. This behaviour can be
overridden.

Closes-Bug: #2045990
Change-Id: I91671dda2242255e789b521d19348b0cccec266f
2024-01-29 15:05:09 +00:00
Piotr Parczewski
03a1b9925d Adjust Ceph metrics scrape interval in Prometheus
Enables modifying the interval and sets the recommended default value.

[1] https://docs.ceph.com/en/latest/mgr/prometheus/#configuration

Change-Id: I4b91d184485aa52b3c06011f9dbb6b34bcad3ca8
2024-01-17 21:40:19 +01:00
Matt Crees
e502b65ba1 Fix OpenSearch upgrade tasks idempotency
Shard allocation is disabled at the start of the OpenSearch upgrade
task. This is set as a transient setting, meaning it will be removed
once the containers are restarted. However, if there is no change in
the OpenSearch container it will not be restarted so the cluster is left
in a broken state: unable to allocate shards.

This patch moves the pre-upgrade tasks to within the handlers, so shard
allocation and the flush are only performed when the OpenSearch
container is going to be restarted.

Closes-Bug: #2049512
Change-Id: Ia03ba23bfbde7d50a88dc16e4f117dec3c98a448
2024-01-17 10:57:52 +00:00
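The idempotency problem in the commit above comes down to ordering: the transient setting is only cleared by a restart, so it must not be applied unless a restart follows. A minimal sketch of the before and after behaviour, using hypothetical step names rather than the real task names:

```python
from typing import List

DISABLE = "disable shard allocation (transient)"
FLUSH = "flush"
RESTART = "restart container"


def old_plan(container_changed: bool) -> List[str]:
    """Pre-upgrade steps always run; the restart only happens on change."""
    steps = [DISABLE, FLUSH]
    if container_changed:
        steps.append(RESTART)
    return steps


def new_plan(container_changed: bool) -> List[str]:
    """Pre-upgrade steps live in the handler, so they run only with a restart."""
    return [DISABLE, FLUSH, RESTART] if container_changed else []


def allocation_left_disabled(steps: List[str]) -> bool:
    """True when allocation was disabled but no restart clears the setting."""
    return DISABLE in steps and RESTART not in steps


print(allocation_left_disabled(old_plan(False)))
print(allocation_left_disabled(new_plan(False)))
```

With no container change, the old ordering leaves the cluster unable to allocate shards, while the new ordering does nothing at all.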
Zuul
3ed60961bb Merge "Fix trove failed to discover swift endpoint" 2024-01-12 11:41:01 +00:00
Zuul
781e3949f4 Merge "Fix trove failed to connect rabbitmq - durable queues support" 2024-01-11 14:13:45 +00:00
wu.chunyang
9eff43809f Fix trove failed to discover swift endpoint
This change fixes Trove failing to discover the Swift endpoint
by adding service_credentials to guest-agent.conf.

Closes-Bug: #2048829

Change-Id: I185484d2a0d0a2d4016df6acf8a6b0a7f934c237
2024-01-11 10:15:12 +00:00
wu.chunyang
6b96d098bf Fix trove failed to connect rabbitmq - durable queues support
This change fixes the Trove guest instance failing to connect to
RabbitMQ by adding durable queues support to the oslo_messaging_rabbit
section in guest-agent.conf.

Partial-Bug: #2048822

Change-Id: I8efc3c92e861816385e6cda3b231a950a06bf57d
2024-01-11 10:11:29 +00:00
Zuul
357db52433 Merge "Enable the Fluentd Plugin Systemd" 2024-01-10 16:00:36 +00:00
Zuul
c78cedfa75 Merge "Fix Nova scp failures on Debian Bookworm" 2024-01-09 13:53:33 +00:00
Pierre Riteau
bfa9dd97a9 Fix Nova scp failures on Debian Bookworm
The addition of an instance resize operation [1] to CI testing is
triggering a failure in kolla-ansible-debian-ovn jobs, which are using a
nodeset with multiple nodes:

    oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
    Command: scp -r /var/lib/nova/instances/8ca2c7e8-acae-404c-af7d-6cac38e354b8_resize/disk 192.0.2.2:/var/lib/nova/instances/8ca2c7e8-acae-404c-af7d-6cac38e354b8/disk
    Exit code: 255
    Stdout: ''
    Stderr: "Warning: Permanently added '[192.0.2.2]:8022' (ED25519) to the list of known hosts.\r\nsubsystem request failed on channel 0\r\nscp: Connection closed\r\n"

This is not seen on Ubuntu Jammy, which uses OpenSSH 8.9, while Debian
Bookworm uses OpenSSH 9.2. This is likely related to this change in
OpenSSH 9.0 [2]:

    This release switches scp(1) from using the legacy scp/rcp protocol
    to using the SFTP protocol by default.

Configure sftp subsystem like on RHEL9 derivatives. Even though it is
not yet required for Ubuntu, we also configure it so we are ready for
the Noble release.

[1] https://review.opendev.org/c/openstack/kolla-ansible/+/904249
[2] https://www.openssh.com/txt/release-9.0

Closes-Bug: #2048700
Change-Id: I9f1129136d7664d5cc3b57ae5f7e8d05c499a2a5
2024-01-08 23:12:38 +01:00
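The version reasoning in the commit above (OpenSSH 8.9 on Jammy, 9.2 on Bookworm, protocol switch in 9.0) can be written as a simple comparison. This is a sketch: `scp_defaults_to_sftp` is a hypothetical helper and it expects a plain "major.minor" string, not the full `9.2p1` form.

```python
def scp_defaults_to_sftp(openssh_version: str) -> bool:
    """Return True if this OpenSSH release uses SFTP for scp by default."""
    major, minor = (int(part) for part in openssh_version.split(".")[:2])
    # OpenSSH 9.0 switched scp from the legacy scp/rcp protocol to SFTP.
    return (major, minor) >= (9, 0)


for distro, version in [("Ubuntu Jammy", "8.9"), ("Debian Bookworm", "9.2")]:
    print(distro, version, scp_defaults_to_sftp(version))
```

A host on the SFTP side of this comparison needs a working sftp subsystem on the target SSH server, which is what the patch configures.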
Michal Arbet
9ecfcf5a17 Enable glance proxying behaviour
This patch sets the URL of the glance worker.
If this is set, other glance workers will know how to contact this one
directly if needed. For image import, a single worker stages the image
and other workers need to be able to proxy the import request to the
right one.

With the current setup, glance image import does not work.

Closes-Bug: #2048525

Change-Id: I4246dc8a80038358cd5b6e44e991b3e2ed72be0e
2024-01-08 16:30:29 +01:00
Zuul
205fd639b8 Merge "cadvisor: Set housekeeping interval to Prometheus scrape interval" 2024-01-06 08:53:43 +00:00
Mark Goddard
97e5c0e9b1 cadvisor: Set housekeeping interval to Prometheus scrape interval
The prometheus_cadvisor container has high CPU usage. On various
production systems I checked it sits around 13-16% on controllers,
averaged over the prometheus 1m scrape interval. When viewed with top we
can see it is a bit spiky and can jump over 100%.

There are various bugs about this, but I found
https://github.com/google/cadvisor/issues/2523 which suggests reducing
the per-container housekeeping interval. This defaults to 1s, which
provides far greater granularity than we need with the default
prometheus scrape interval of 60s.

Reducing the housekeeping interval to 60s on a production controller
reduced the CPU usage from 13% to 3.5% average. This still seems high,
but is more reasonable.

Change-Id: I89c62a45b1f358aafadcc0317ce882f4609543e7
Closes-Bug: #2048223
2024-01-05 11:02:41 +00:00
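The numbers in the cadvisor commit above can be checked with a little arithmetic. The sketch below only restates figures from the message (1s and 60s intervals, 13% and 3.5% CPU); the function names are hypothetical.

```python
def housekeeping_runs_per_scrape(housekeeping_s: float, scrape_s: float) -> float:
    """How many housekeeping passes happen between two Prometheus scrapes."""
    return scrape_s / housekeeping_s


def relative_reduction_percent(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


print(housekeeping_runs_per_scrape(1, 60))    # default housekeeping interval
print(housekeeping_runs_per_scrape(60, 60))   # interval matched to the scrape
print(round(relative_reduction_percent(13, 3.5)))
```

With the default interval, most housekeeping passes are never observed by a 60s scrape, which is why matching the two intervals costs no useful granularity.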
Dawud
140722f74e Enable HAProxy Prometheus metrics endpoint
HAProxy exposes a Prometheus metrics endpoint, it just needs to be
enabled. Enable this and remove configuration for
prometheus-haproxy-exporter. Remaining prometheus-haproxy-exporter
containers will automatically be removed.

Change-Id: If6e75691d2a996b06a9b95cb0aae772db54389fb
Co-Authored-By: Matt Anson <matta@stackhpc.com>
2024-01-05 10:36:31 +00:00
Michal Arbet
b1fd2b40f7 Fix long service restarts while using systemd
Some containers exit with 143 instead of 0, but
this is still OK. This patch allows
ExitCode 143 (SIGTERM) as a fix. Details are in
the bug report.

Services which exited with 143 (SIGTERM):

kolla-cron-container.service
kolla-designate_producer-container.service
kolla-keystone_fernet-container.service
kolla-letsencrypt_lego-container.service
kolla-magnum_api-container.service
kolla-mariadb_clustercheck-container.service
kolla-neutron_l3_agent-container.service
kolla-openvswitch_db-container.service
kolla-openvswitch_vswitchd-container.service
kolla-proxysql-container.service

Partial-Bug: #2048130
Change-Id: Ia8c85d03404cfb368e4013066c67acd2a2f68deb
2024-01-05 10:06:56 +01:00
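Exit code 143 in the commit above follows the common shell convention of 128 plus the signal number, and SIGTERM is signal 15. A short sketch that decodes such codes (`describe_exit_code` is a hypothetical helper; signal names depend on the platform, and Linux is assumed here):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map an exit code above 128 to a signal name, else report it verbatim."""
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            # No signal with that number on this platform.
            pass
    return f"exit {code}"


print(describe_exit_code(143))
print(describe_exit_code(0))
```

A container that stops cleanly in response to SIGTERM can therefore still report 143, which is why treating that code as success is a reasonable fix.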
Zuul
3681427b31 Merge "Persist Neutron agent state files in volume" 2024-01-03 09:51:57 +00:00
Zuul
65886c1d4e Merge "Fix wsrep sync status task while switched to TCP/IP" 2024-01-02 14:07:22 +00:00
Zuul
00fc2f85b3 Merge "Set a log retention policy for OpenSearch" 2023-12-21 15:17:32 +00:00
Doug Szumski
5e5a2dca09 Set a log retention policy for OpenSearch
We previously used ElasticSearch Curator for managing log
retention. Now that we have moved to OpenSearch, we can use
the Index State Management (ISM) plugin which is bundled with
OpenSearch.

This change adds support for automating the configuration of
the ISM plugin via the OpenSearch API. By default, it has
similar behaviour to the previous ElasticSearch Curator
default policy.

Closes-Bug: #2047037

Change-Id: I5c6d938f2bc380f1575ee4f16fe17c6dca37dcba
2023-12-21 10:51:17 +01:00
Zuul
17a76d2a0e Merge "Add precheck for RabbitMQ quorum queues" 2023-12-14 14:54:40 +00:00
Pierre Riteau
693c5c8b23 Fix Docker health check for sahara_engine
The wrong process name was being used.

Closes-Bug: #2046268
Change-Id: I5a5d4f227205e811732331ee6e020ccea67b6fab
2023-12-14 09:53:03 +00:00
Zuul
c0cddb0967 Merge "Configures the tap-as-a-service neutron plugin" 2023-12-13 16:11:36 +00:00
Matt Crees
61f84e3beb Add precheck for RabbitMQ quorum queues
Adds a precheck to fail if non-quorum queues are found in RabbitMQ.

Currently excludes fanout and reply queues, pending support in
oslo.messaging [1].

[1]: https://review.opendev.org/c/openstack/oslo.messaging/+/888479

Closes-Bug: #2045887
Change-Id: Ibafdcd58618d97251a3405ef9332022d4d930e2b
2023-12-13 14:49:05 +00:00
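The quorum queue precheck above amounts to filtering the queue list. The following is a sketch, not the task from the patch: it assumes each queue is a dict with `name` and `type` keys, and that fanout and reply queues can be recognised by `_fanout` in the name and a `reply_` prefix. Those naming patterns are assumptions for illustration.

```python
from typing import Dict, List


def non_quorum_queues(queues: List[Dict[str, str]]) -> List[str]:
    """Return names of queues that should be quorum queues but are not."""
    offenders = []
    for queue in queues:
        name = queue["name"]
        # Assumed naming patterns for the excluded fanout and reply queues.
        if "_fanout" in name or name.startswith("reply_"):
            continue
        if queue.get("type") != "quorum":
            offenders.append(name)
    return offenders


# Hypothetical sample data.
sample = [
    {"name": "conductor", "type": "quorum"},
    {"name": "scheduler", "type": "classic"},
    {"name": "scheduler_fanout_1234", "type": "classic"},
    {"name": "reply_abcd", "type": "classic"},
]
print(non_quorum_queues(sample))
```

The precheck would fail whenever the returned list is non-empty.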