Add doc entry to check partition count
A high or increasing partition count due to stored handoffs can have
severe side effects, and replication might never be able to catch up.
This patch adds a note to the admin_guide on how to check this.

Change-Id: Ib4e161d68f1a82236dbf5fac13ef9a13ac4bbf18
@@ -617,13 +617,90 @@ have 6 replicas in region 1.
You should be aware that, if you have data coming into SF faster than
your replicators are transferring it to NY, then your cluster's data
distribution will get worse and worse over time as objects pile up in SF.
If this happens, it is recommended to disable write_affinity and simply let
object PUTs traverse the WAN link, as that will naturally limit the
object growth rate to what your WAN link can handle.
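
Write affinity is controlled in the proxy server configuration. A minimal
sketch of the relevant ``proxy-server.conf`` stanza (``r1`` is only an
example value; leaving the option empty disables write affinity)::

    [app:proxy-server]
    use = egg:swift#proxy
    # write_affinity = r1 would send new object writes to region 1 first;
    # an empty value (the default) disables write affinity
    write_affinity =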

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Checking handoff partition distribution
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can check if handoff partitions are piling up on a server by
comparing the expected number of partitions with the actual number on
your disks. First get the number of partitions that are currently
assigned to a server using the ``dispersion`` command from
``swift-ring-builder``::

    swift-ring-builder sample.builder dispersion --verbose
    Dispersion is 0.000000, Balance is 0.000000, Overload is 0.00%
    Required overload is 0.000000%
    --------------------------------------------------------------------------
    Tier                       Parts      %   Max     0      1      2      3
    --------------------------------------------------------------------------
    r1                          8192   0.00     2     0      0   8192      0
    r1z1                        4096   0.00     1  4096   4096      0      0
    r1z1-172.16.10.1            4096   0.00     1  4096   4096      0      0
    r1z1-172.16.10.1/sda1       4096   0.00     1  4096   4096      0      0
    r1z2                        4096   0.00     1  4096   4096      0      0
    r1z2-172.16.10.2            4096   0.00     1  4096   4096      0      0
    r1z2-172.16.10.2/sda1       4096   0.00     1  4096   4096      0      0
    r1z3                        4096   0.00     1  4096   4096      0      0
    r1z3-172.16.10.3            4096   0.00     1  4096   4096      0      0
    r1z3-172.16.10.3/sda1       4096   0.00     1  4096   4096      0      0
    r1z4                        4096   0.00     1  4096   4096      0      0
    r1z4-172.16.20.4            4096   0.00     1  4096   4096      0      0
    r1z4-172.16.20.4/sda1       4096   0.00     1  4096   4096      0      0
    r2                          8192   0.00     2     0   8192      0      0
    r2z1                        4096   0.00     1  4096   4096      0      0
    r2z1-172.16.20.1            4096   0.00     1  4096   4096      0      0
    r2z1-172.16.20.1/sda1       4096   0.00     1  4096   4096      0      0
    r2z2                        4096   0.00     1  4096   4096      0      0
    r2z2-172.16.20.2            4096   0.00     1  4096   4096      0      0
    r2z2-172.16.20.2/sda1       4096   0.00     1  4096   4096      0      0

As you can see from the output, each server should store 4096 partitions, and
each region should store 8192 partitions. This example used a partition power
of 13 and 3 replicas.
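
As a quick sanity check, these numbers follow directly from the ring
parameters used here (assuming all six devices have equal weight)::

    Distinct partitions:      2^13      = 8192
    Total partition replicas: 8192 * 3  = 24576
    Partitions per device:    24576 / 6 = 4096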

With write_affinity enabled, expect to find a higher number of partitions
on disk than the value reported by the ``swift-ring-builder dispersion``
command. The number of additional (handoff) partitions in region r1
depends on your cluster size, the amount of incoming data and the
replication speed.

Let's use the example from above with 6 nodes in 2 regions, and write_affinity
configured to write to region r1 first. ``swift-ring-builder`` reported that
each node should store 4096 partitions::

    Expected partitions for region r2: 8192
    Handoffs stored across 4 nodes in region r1: 8192 / 4 = 2048
    Maximum number of partitions on each server in region r1: 2048 + 4096 = 6144
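
The same worst-case arithmetic as a small shell sketch that you can adapt;
the values below are the ones from this example, not read from a live
cluster::

    part_power=13 ; nodes_r1=4 ; parts_per_node=4096
    parts_r2=$(( 1 << part_power ))       # 8192 partitions homed in r2
    handoffs=$(( parts_r2 / nodes_r1 ))   # 2048 handoff partitions per r1 node
    echo "worst case per r1 node: $(( parts_per_node + handoffs ))"   # 6144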

The worst case is that handoff partitions in region 1 are populated with new
object replicas faster than replication is able to move them to region 2.
In that case you will see ~6144 partitions per server in region r1. Your
actual number should be lower, somewhere between 4096 and 6144 partitions
(preferably on the lower side).

Now count the number of object partitions on a given server in region 1,
for example on 172.16.10.1. Note that the pathnames might be
different; ``/srv/node/`` is the default mount location, and ``objects``
applies only to storage policy 0 (storage policy 1 would use
``objects-1`` and so on)::

    find -L /srv/node/ -maxdepth 3 -type d -wholename "*objects/*" | wc -l
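
If you use multiple storage policies, or want the count broken down per
device and policy, a loop along these lines does the same job (a sketch;
adjust the mount location if it differs from ``/srv/node/``)::

    for d in /srv/node/*/objects*; do
        echo "$d: $(find -L "$d" -mindepth 1 -maxdepth 1 -type d | wc -l)"
    done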

If this number is always at the upper end of the expected partition
number range (4096 to 6144), or increasing, you should check your
replication speed and maybe even disable write_affinity.
Please refer to the next section for how to collect metrics from Swift,
and especially to :ref:`swift-recon -r <recon-replication>` for how to
check replication stats.
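
For example, object replication stats for the whole cluster can be queried
with (assuming ``swift-recon`` is installed and can reach the storage nodes
listed in your rings)::

    swift-recon object -r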

--------------------------------
Cluster Telemetry and Monitoring
--------------------------------

@@ -748,6 +825,8 @@ This information can also be queried via the swift-recon command line utility::

                        Time to wait for a response from a server
  --swiftdir=SWIFTDIR   Default = /etc/swift

.. _recon-replication:

For example, to obtain container replication info from all hosts in zone "3"::

    fhines@ubuntu:~$ swift-recon container -r --zone 3