The replica placement algorithm works on regions, then zones, then
IP/port, then device ID. The handoff algorithm previously worked on
regions, then zones, then device ID, skipping IP/port entirely. It has
now been updated to take IP/port into consideration.
This means you get one handoff on each machine in the cluster before
you start getting handoffs that share a machine with a previous
one. In small clusters, this can help with durability.
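As a rough illustration of the ordering described above, here is a
minimal sketch. It is not the ring's actual code: the names Dev,
tiers_for, handoffs and skip_ip_port are hypothetical, and the candidate
order is simply the list order passed in.

```python
from collections import namedtuple

# Hypothetical device record, for illustration only.
Dev = namedtuple("Dev", "id region zone ip port")

REGION, ZONE, IP_PORT, DEVICE = range(4)


def tiers_for(dev):
    """Return the device's tiers from coarsest (region) to finest (device)."""
    return (
        (dev.region,),
        (dev.region, dev.zone),
        (dev.region, dev.zone, dev.ip, dev.port),
        (dev.region, dev.zone, dev.ip, dev.port, dev.id),
    )


def handoffs(candidates, primaries, skip_ip_port=False):
    """Yield handoffs, preferring unused regions, zones, IP/ports, devices.

    skip_ip_port=True mimics the old behaviour (assumed here to be the
    same passes minus the IP/port one).
    """
    used = [set() for _ in range(4)]

    def mark(dev):
        for level, tier in enumerate(tiers_for(dev)):
            used[level].add(tier)

    for dev in primaries:
        mark(dev)

    levels = [REGION, ZONE, DEVICE] if skip_ip_port \
        else [REGION, ZONE, IP_PORT, DEVICE]
    for level in levels:
        for dev in candidates:
            tiers = tiers_for(dev)
            # Skip devices already handed out, and devices whose tier at
            # this level has already been used.
            if tiers[DEVICE] in used[DEVICE] or tiers[level] in used[level]:
                continue
            mark(dev)
            yield dev


# One region, one zone, three machines; device 0 is the only primary.
devs = [
    Dev(0, 1, 1, "10.0.0.1", 6000),
    Dev(1, 1, 1, "10.0.0.1", 6000),
    Dev(2, 1, 1, "10.0.0.2", 6000),
    Dev(3, 1, 1, "10.0.0.2", 6000),
    Dev(4, 1, 1, "10.0.0.3", 6000),
]
primaries, candidates = devs[:1], devs[1:]

new_order = [d.id for d in handoffs(candidates, primaries)]
old_order = [d.id for d in handoffs(candidates, primaries, skip_ip_port=True)]
print(new_order)  # [2, 4, 1, 3]
print(old_order)  # [1, 2, 3, 4]
```

With the IP/port pass, the first two handoffs land on the two machines
that do not hold the primary; without it, the first handoff shares a
machine with the primary.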
Because this is performance-critical code, here are some quick
benchmark results:
Average run times, in seconds, over 25000 trials on a 1200-device ring
(part power 20, 3 replicas, 2 regions, 10 zones, 120 nodes):
                   |   master    |   branch
===================+=============+============
get 1 more node    | 2.727e-05   | 3.076e-05
get 6 more nodes   | 3.55e-05    | 4.214e-05
get all more nodes | 0.002247    | 0.002691
There's a small slowdown from the additional bookkeeping, but nothing
too awful. The time to get 6 more nodes (handoff checks on a 404 look
at 2x the replica count by default, hence 6) went from 35.5 to 42
microseconds, so it remains small.
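For reference, the relative slowdown can be worked out from the table
with a few lines of Python. This is only arithmetic on the numbers
above; slowdown_percent is a hypothetical helper name.

```python
# Average run times in seconds, copied from the benchmark table:
# (master, branch)
results = {
    "get 1 more node": (2.727e-05, 3.076e-05),
    "get 6 more nodes": (3.55e-05, 4.214e-05),
    "get all more nodes": (0.002247, 0.002691),
}


def slowdown_percent(master, branch):
    """Return how much slower branch is than master, in percent."""
    return (branch / master - 1.0) * 100.0


for name, (master, branch) in results.items():
    print(f"{name}: +{slowdown_percent(master, branch):.1f}%")
```

That works out to roughly 13%, 19% and 20% respectively.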
Change-Id: Ie7da4dfcb0fcf1a38e2fb13f60c204540fadbf06