From 5b0a281d5122d1806f2e689f5b5c7f48658b41d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ga=C3=ABtan=20Trellu?= <gaetan.trellu@incloudus.com>
Date: Wed, 24 Jul 2019 12:38:40 -0400
Subject: [PATCH] Set RabbitMQ cluster_partition_handling to pause_minority
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

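With autoheal, both sides of a network partition keep running and can
diverge until the partition heals. With pause_minority, nodes that find
themselves in the minority pause instead, which avoids split-brain.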

This change also adds relevant docs that sort out the
HA/quorum questions.

Change-Id: I9a8c2ec4dbbd0318beb488548b2cde8f4e487dc1
Closes-Bug: #1837761
Co-authored-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
---
 .../roles/rabbitmq/templates/rabbitmq.conf.j2 |  3 +-
 .../admin/production-architecture-guide.rst   | 40 +++++++++++++++++++
 ...q-partition-handling-5aebe0bf7e361239.yaml |  8 ++++
 3 files changed, 50 insertions(+), 1 deletion(-)
 create mode 100644 releasenotes/notes/change-rabbitmq-partition-handling-5aebe0bf7e361239.yaml
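
Reviewer note (commentary only, not part of the change): the node counts
recommended in the docs hunk below follow from plain majority arithmetic.
A minimal sketch in Python, assuming a strict-majority quorum. The helper
names are hypothetical and belong to neither Kolla Ansible nor RabbitMQ.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes may fail while the rest still hold a majority."""
    return nodes - quorum_size(nodes)


# Compare the cluster sizes discussed in the guide.
for n in (2, 3, 4, 5):
    print(f"{n} nodes: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption a 2-node cluster tolerates no failures, and a 4-node
cluster tolerates no more than a 3-node one, which is why odd sizes are
the usual choice.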

diff --git a/ansible/roles/rabbitmq/templates/rabbitmq.conf.j2 b/ansible/roles/rabbitmq/templates/rabbitmq.conf.j2
index e95b7cccab..39799fa400 100644
--- a/ansible/roles/rabbitmq/templates/rabbitmq.conf.j2
+++ b/ansible/roles/rabbitmq/templates/rabbitmq.conf.j2
@@ -2,7 +2,8 @@ listeners.tcp.1 = {{ api_interface_address }}:{{ role_rabbitmq_port }}
 {% if rabbitmq_hipe_compile|bool %}
 hipe_compile = true
 {% endif %}
-cluster_partition_handling = autoheal
+{# NOTE: to avoid split-brain #}
+cluster_partition_handling = pause_minority
 
 management.listener.ip = {{ api_interface_address }}
 management.listener.port = {{ role_rabbitmq_management_port }}
diff --git a/doc/source/admin/production-architecture-guide.rst b/doc/source/admin/production-architecture-guide.rst
index c7f8d9f97a..c9e1740f01 100644
--- a/doc/source/admin/production-architecture-guide.rst
+++ b/doc/source/admin/production-architecture-guide.rst
@@ -123,3 +123,43 @@ commits and rabbitmq.
 This becomes especially relevant when ``enable_central_logging`` and
 ``openstack_logging_debug`` are both set to true, as fully loaded 130 node
 cluster produced 30-50GB of logs daily.
+
+High Availability (HA) and scalability
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+HA is an important topic in production systems.
+HA concerns itself with redundant instances of services so that the overall
+service can be provided with close-to-zero interruption in case of failure.
+Scalability often works hand-in-hand with HA to provide load sharing by
+the use of load balancers.
+
+OpenStack services
+------------------
+
+Multinode Kolla Ansible deployments provide HA and scalability for services.
+OpenStack API endpoints are a prime example here: redundant ``haproxy``
+instances provide HA with ``keepalived`` while the backends are also
+deployed redundantly to enable both HA and load balancing.
+
+Other core services
+-------------------
+
+The core non-OpenStack components required by most deployments, the SQL
+database provided by ``mariadb`` and the message queue provided by
+``rabbitmq``, are also deployed in an HA way. Care has to be taken,
+however, as unlike the previously described services, these have more
+complex HA mechanisms. The reason is that they provide the central,
+persistent storage of information about the cloud, and every other service
+assumes that this storage is in a consistent state (aka integrity).
+This assumption leads to the requirement of quorum establishment
+(look up the CAP theorem for greater insight).
+
+Quorum needs a majority vote, and hence deploying 2 instances of these
+does not provide (by default) any HA: when one instance fails, the
+remaining one no longer has a majority and stops serving as well. Hence
+the recommended number of instances is ``3``, where 1 node failure is
+acceptable. For scaling purposes and better resilience it is possible to
+use ``5`` nodes and have 2 failures acceptable.
+Note, however, that higher numbers usually provide no benefit, due to the
+amount of communication between the quorum members themselves and the
+non-zero probability of the communication medium itself failing instead.
diff --git a/releasenotes/notes/change-rabbitmq-partition-handling-5aebe0bf7e361239.yaml b/releasenotes/notes/change-rabbitmq-partition-handling-5aebe0bf7e361239.yaml
new file mode 100644
index 0000000000..28028cc8a9
--- /dev/null
+++ b/releasenotes/notes/change-rabbitmq-partition-handling-5aebe0bf7e361239.yaml
@@ -0,0 +1,8 @@
+---
+upgrade:
+  - Set RabbitMQ ``cluster_partition_handling`` to ``pause_minority``.
+    This is to avoid split-brain.
+    The setting can be overridden using custom config.
+    Note that this new config requires at least a 3-node RabbitMQ cluster
+    to provide HA (High Availability).
+    See the production architecture guide for more info.