rabbitmq: set cluster_partition_handling to 'ignore'

The pause_minority strategy tends to cause more problems than it
solves.  If a partition is brief enough that no nodes are fenced, the
pausing and unpausing of minority nodes (especially during a partial
partition) frequently causes RabbitMQ to crash in odd ways consistent
with race conditions.

By ignoring partitions, we will tolerate brief partitions better.
Longer partitions will be handled via fencing, which does not suffer
from race conditions when pausing/unpausing nodes.
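
For illustration only, the difference between the two strategies can be
sketched as below. The function and its parameters are hypothetical and
are not part of RabbitMQ or of this change. The minority rule used here
(a node pauses when its side of the partition holds at most half of the
cluster's nodes) is assumed from the RabbitMQ documentation's description
of pause_minority.

```python
def should_pause(strategy: str, cluster_size: int, reachable: int) -> bool:
    """Return True if a node would pause itself after detecting a partition.

    `reachable` counts the node itself plus every peer it can still see.
    Hypothetical helper for illustration; not a RabbitMQ API.
    """
    if cluster_size < 1 or not 1 <= reachable <= cluster_size:
        raise ValueError("reachable must be between 1 and cluster_size")
    if strategy == "ignore":
        # Nodes keep running; recovery from long partitions is left to fencing.
        return False
    if strategy == "pause_minority":
        # "Minority" means at most half the cluster, so an even split
        # pauses both sides.
        return reachable * 2 <= cluster_size
    raise ValueError(f"unsupported strategy: {strategy}")


if __name__ == "__main__":
    for strategy in ("pause_minority", "ignore"):
        for reachable in (1, 2, 3):
            print(strategy, reachable, should_pause(strategy, 3, reachable))
```

In a three-node cluster, an isolated node pauses under pause_minority
and keeps running under 'ignore'. With 'ignore' there is no
pause/unpause cycle for a brief partition to race against.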

Change-Id: Icb05c6b95a207c4ef818fb90fa9a2c041a5e85cf
John Eckersberg 2017-10-06 08:11:26 -04:00
parent 4416428af3
commit 833e3baeb3
2 changed files with 7 additions and 1 deletions

@@ -103,7 +103,7 @@ outputs:
 inet_dist_listen_min: '25672'
 inet_dist_listen_max: '25672'
 rabbitmq_config_variables:
-cluster_partition_handling: 'pause_minority'
+cluster_partition_handling: 'ignore'
 queue_master_locator: '<<"min-masters">>'
 loopback_users: '[]'
 rabbitmq::erlang_cookie:

@@ -0,0 +1,6 @@
+---
+fixes:
+  - |
+    Changes the default RabbitMQ partition handling strategy from
+    'pause_minority' to 'ignore', avoiding crashes due to race
+    conditions with nodes starting and stopping concurrently.