oslo.messaging/releasenotes/notes/rabbit_transient_quorum-fc3c3f88ead90034.yaml
Arnaud Morin 989dbb8aad Enable use of quorum queues for transient messages
Add a new flag rabbit_transient_quorum_queue to enable the use of quorum
queues for transient queues (reply_ and _fanout_).

This greatly helps OpenStack services survive (and recover from)
a RabbitMQ node issue.

Related-bug: #2031497

Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Change-Id: Icee5ee6938ca7c9651f281fb835708fc88b8464f
2023-11-12 00:08:20 +01:00

---
features:
  - |
    Add an option, ``rabbit_transient_quorum_queue``, to let transient
    queues use quorum queues.

    Transient queues in OpenStack are not so transient: they live for the
    whole process lifetime (e.g. until you restart a service, like
    nova-compute). Transient here means they belong to a specific process,
    as opposed to regular queues, which may be shared by several processes.
    Usually, transient queues are the "fanout" and "reply" queues.

    By default, without any RabbitMQ policy tuning, they are neither durable
    nor highly available.

    By enabling quorum for transient queues, oslo.messaging will declare
    quorum queues instead of classic queues on RabbitMQ. As a result, those
    queues automatically become HA and durable.

    Note that this may have an impact on your cluster, as RabbitMQ will need
    more CPU, RAM and network bandwidth to manage the queues. This was tested
    at fairly large scale (2k hypervisors) with a cluster of 5 nodes.

    Also note that the current RabbitMQ implementation relies on a fixed
    number of Erlang atoms (5M by default), and one atom is consumed each
    time a quorum queue is created with a different name. If your deployment
    does a lot of queue deletion/creation, you may exhaust your atoms more
    quickly.

    When enabling quorum for transient queues, you may also want to update
    your RabbitMQ policies accordingly (e.g. make sure they apply to quorum
    queues).

    This option stays disabled by default for now but may become the
    default in the future.
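
    As a minimal sketch, enabling the option in a service configuration
    file could look like the fragment below. The option name comes from
    this change; the ``[oslo_messaging_rabbit]`` group is assumed here
    because it is where the RabbitMQ driver options of oslo.messaging
    live, and it is not stated in this note. Which file to edit (for
    example the configuration file of nova-compute) depends on your
    deployment.

    ```ini
    # Assumed group: RabbitMQ driver options of oslo.messaging.
    [oslo_messaging_rabbit]
    # Declare reply and fanout (transient) queues as quorum queues.
    # Disabled by default.
    rabbit_transient_quorum_queue = true
    ```

    The new queue type only applies to queues declared after the service
    is restarted, since transient queues live for the process lifetime.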