Fix intermittent problem in part_swapping test
There is an intermittent failure in the test_part_swapping_problem test found in test/unit/common/ring/test_builder.py. The test does a rebalance, then changes the ring to test a specific problem, does some housekeeping and then rebalances again. The problem is the ringbuilder keeps track of where in the ring it started the last ring rebalance, saved in `_last_part_gather_start`. On a rebalance, or more specifically in `_gather_parts_for_balance` we then we will start somewhere on the other side of the ring with: quarter_turn = (self.parts // 4) random_half = random.randint(0, self.parts / 2) start = (self._last_part_gather_start + quarter_turn + random_half) % self.parts Because we don't reset `_last_part_gather_start` when we change the ring in the test, there is edge case where if we are unlucky during both rebalances whereby both calls to randint returns a relatively large number pushes the start of the second rebalance to the wrong side of the ring. Actually it's more problematic, and only 1 large random and a one in the middle will cause it, maybe pictures help: rabalance 1 (r1): quarter_turn = 4, random_half = 5 rebalance 2 (r2): quarter_turn = 4, random_half = 3 r1 r2 | | v v array('H', [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]), array('H', [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 2, 3, 3, 3]), array('H', [2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4])] Now when gathering for rebalance 2 it'll pick: array('H', [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, X]), array('H', [X, X, 1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 2, 3, 3, 3]), array('H', [2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4])] Which can cause the 3 attempts to gather and rebalance to be used up. This causes the intermittent failure seen in the bug. This patch solves this by resetting `_gather_parts_for_balance` to 0 while we tidy up the ring change. Meaning we'll always start on the correct side of the ring. Change-Id: I0d3a69620d4734091dfa516efd0d6b2ed87e196b Closes-Bug: #1724356
This commit is contained in:
parent
b86bf15a64
commit
a7da223262
@ -1072,6 +1072,11 @@ class TestRingBuilder(unittest.TestCase):
|
||||
new_dev_parts[dev_id] += 1
|
||||
for dev in rb._iter_devs():
|
||||
dev['parts'] = new_dev_parts[dev['id']]
|
||||
# reset the _last_part_gather_start otherwise
|
||||
# there is a chance it'll unluckly wrap and try and
|
||||
# move one of the device 1's from replica 2
|
||||
# causing the intermitant failure in bug 1724356
|
||||
rb._last_part_gather_start = 0
|
||||
|
||||
rb.pretend_min_part_hours_passed()
|
||||
rb.rebalance()
|
||||
|
Loading…
x
Reference in New Issue
Block a user