From b308773b2090bbba6a953e46c1a75562b650f286 Mon Sep 17 00:00:00 2001
From: Matt Crees
Date: Thu, 13 Apr 2023 09:55:56 +0100
Subject: [PATCH] Add precheck to fail if RabbitMQ HA needs configuring

Currently, the process of enabling RabbitMQ HA with the variable
``om_enable_rabbitmq_high_availability`` requires some manual steps to
migrate from transient to mirrored queues. In preparation for setting
this variable to ``True`` by default, this adds a precheck that will
fail if a system is currently running non-mirrored queues and
``om_enable_rabbitmq_high_availability`` is set to ``True``.

Includes a helpful message informing the operator of their options:
either follow the manual procedure to migrate the queues described in
the docs, or set ``om_enable_rabbitmq_high_availability`` to ``False``.

The RabbitMQ HA section of the reference docs is updated to include
these instructions.

Change-Id: Ic5e64998bd01923162204f7bb289cc110187feec
(cherry picked from commit a5331d3208a0bcb4aada9b65acb71c51353cd9b9)
---
 ansible/roles/rabbitmq/tasks/precheck.yml     | 24 +++++++++++++++++
 .../reference/message-queues/rabbitmq.rst     | 26 +++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/ansible/roles/rabbitmq/tasks/precheck.yml b/ansible/roles/rabbitmq/tasks/precheck.yml
index c53b2e7891..06afc4de77 100644
--- a/ansible/roles/rabbitmq/tasks/precheck.yml
+++ b/ansible/roles/rabbitmq/tasks/precheck.yml
@@ -194,3 +194,27 @@
     - enable_outward_rabbitmq | bool
     - rabbitmq_enable_tls | bool
     - key | length == 0
+
+- block:
+    - name: List RabbitMQ policies
+      become: true
+      command: "docker exec rabbitmq rabbitmqctl list_policies --silent"
+      register: rabbitmq_policies
+      changed_when: false
+      check_mode: false
+
+    - name: Check if RabbitMQ HA needs to be configured
+      assert:
+        that: "'ha-all' in rabbitmq_policies.stdout"
+        fail_msg: >
+          om_enable_rabbitmq_high_availability is True but no mirroring policy has been found.
+          Currently the procedure to migrate from transient non-mirrored queues to durable mirrored queues is manual.
+          Please follow the process described here: https://docs.openstack.org/kolla-ansible/latest/reference/message-queues/rabbitmq.html#high-availability.
+          Note that this process may take several hours on larger systems, and may cause a degradation in performance at large scale.
+          If you do not wish to enable this feature, set om_enable_rabbitmq_high_availability to False.
+
+  run_once: true
+  when:
+    - container_facts['rabbitmq'] is defined
+    - om_enable_rabbitmq_high_availability | bool
+  tags: rabbitmq-ha-precheck
diff --git a/doc/source/reference/message-queues/rabbitmq.rst b/doc/source/reference/message-queues/rabbitmq.rst
index 793c4b9da1..75c7bac15d 100644
--- a/doc/source/reference/message-queues/rabbitmq.rst
+++ b/doc/source/reference/message-queues/rabbitmq.rst
@@ -118,3 +118,29 @@ availability. These are durable queues and classic queue mirroring. Setting the
 flag ``om_enable_rabbitmq_high_availability`` to ``true`` will enable both of
 these features. There are some queue types which are intentionally not mirrored
 using the exclusionary pattern ``^(?!(amq\\.)|(.*_fanout_)|(reply_)).*``.
+
+After enabling this flag on a running system, there are some additional steps
+needed to migrate from transient to durable queues.
+
+1. Stop all OpenStack services which use RabbitMQ, so that they will not
+   attempt to recreate any queues yet.
+
+2. Reconfigure RabbitMQ to enable classic queue mirroring.
+
+   .. code-block:: console
+
+      kolla-ansible reconfigure --tags rabbitmq --skip-tags rabbitmq-ha-precheck
+
+3. Reset the state on each RabbitMQ node with the following commands. Each
+   command must be run on all RabbitMQ nodes before moving on to the next
+   command. This will remove all queues.
+
+   .. code-block:: console
+
+      rabbitmqctl stop_app
+      rabbitmqctl force_reset
+      rabbitmqctl start_app
+
+4. Reconfigure the OpenStack services using ``kolla-ansible reconfigure``, at
+   which point they will start again and recreate the appropriate queues as
+   durable.
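
Note for operators (not part of the patch): step 3 of the documented procedure requires each ``rabbitmqctl`` command to finish on every node before the next command starts. The ordering could be scripted roughly as below. This is a minimal sketch under stated assumptions: the host names, the use of ``ssh``, and the ``DRY_RUN`` and ``RABBITMQ_NODES`` variables are hypothetical, and it assumes RabbitMQ runs in a Docker container named ``rabbitmq``, as the precheck command in this patch does.

```shell
#!/bin/sh
set -eu

# Hypothetical host names (assumption); replace with the hosts in your
# RabbitMQ inventory group.
RABBITMQ_NODES="${RABBITMQ_NODES:-rabbit1 rabbit2 rabbit3}"

# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Run one rabbitmqctl subcommand inside the rabbitmq container on one node.
run_on_node() {
    node="$1"
    subcommand="$2"
    remote_cmd="docker exec rabbitmq rabbitmqctl ${subcommand}"
    if [ "${DRY_RUN}" = "1" ]; then
        echo "ssh ${node} ${remote_cmd}"
    else
        ssh "${node}" "${remote_cmd}"
    fi
}

# Outer loop over commands, inner loop over nodes: each command completes
# on all nodes before the next command begins. force_reset removes all queues.
reset_rabbitmq_cluster() {
    for subcommand in stop_app force_reset start_app; do
        for node in ${RABBITMQ_NODES}; do
            run_on_node "${node}" "${subcommand}"
        done
    done
}

reset_rabbitmq_cluster
```

With the defaults the script only prints the nine commands it would run, in order. Setting ``DRY_RUN=0`` executes them, which removes all queues, so it should only be done after the OpenStack services have been stopped (step 1) and RabbitMQ has been reconfigured (step 2).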