48 Commits

Author SHA1 Message Date
Samuel Merritt
ccf0758ef1 Add ring-builder analyzer.
This is a tool to help developers quantify changes to the ring
builder. It takes a scenario (JSON file) describing the builder's
basic parameters (part_power, replicas, etc.) and a number of
"rounds", where each round is a set of operations to perform on the
builder. For each round, the operations are applied, and then the
builder is rebalanced until it reaches a steady state.

The idea is that a developer observes the ring builder behaving
suboptimally, writes a scenario to reproduce the behavior, modifies
the ring builder to fix it, and references the scenario with the
commit so that others can see that things have improved.

I decided to write this after writing my fourth or fifth hacky one-off
script to reproduce some bad behavior in the ring builder.

Change-Id: I114242748368f142304aab90a6d99c1337bced4c
2015-07-02 08:16:03 -07:00
Mark Seger
af734b3fb6 Change usage help and Attention messages to warnings
Change-Id: I1396aaffe36e739606f15f7fef37b11bd83f1fc1
2015-06-03 15:32:25 -04:00
Christian Schwede
b4c1d73ad5 Make swift-recon compatible for servers without storage policies
Swift recon introduced a new key for storage policies, and the CLI expected this
key in the server response. However, if one updates the CLI but not yet the
server an exception will be raised, because there is no default value and no
check if the key is included in the response.

This change checks if the policies key is included in the response and updates
one test to ensure backward compatibility.
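
A minimal sketch of this kind of defensive lookup, assuming a hypothetical
helper name and response layout (this is not the actual swift-recon code):

```python
def policy_stats(response):
    """Return per-policy stats from a recon response.

    Hypothetical helper: servers that predate storage policies do not
    send a 'policies' key, so fall back to an empty dict instead of
    raising KeyError.
    """
    return response.get('policies', {})


# Response shapes are illustrative assumptions.
old_server = {'objects': 5}
new_server = {'objects': 5, 'policies': {'0': {'objects': 5}}}

for name, resp in (('old', old_server), ('new', new_server)):
    print('%s server policies: %r' % (name, policy_stats(resp)))
```

With a default in place, an updated CLI can talk to a server that has not
been upgraded yet.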

Closes-Bug: 1453599

Change-Id: I7c7a90f9933bec2ab45595df9dc600a6cba65666
2015-06-01 07:00:39 +00:00
Jenkins
ad66801915 Merge "More user-friendly output for object metadata" 2015-04-16 22:15:05 +00:00
Ricardo Ferreira
57011d5699 More user-friendly output for object metadata
Split out system, user and other metadata in swift-object-info. Print
each item on its own line instead of as a raw dict representation, so
the output is easier to parse with tools such as grep.

Co-Authored-By: Ricardo Ferreira <ricardo.sff@gmail.com>
Co-Authored-By: Kamil Rykowski <kamil.rykowski@intel.com>

Change-Id: Ia78da518c18f7e26016700aee87efb534fbd2040
Closes-Bug: #1428866
2015-04-16 09:20:14 +02:00
Christian Schwede
663ccd4e7a More tests for swift recon
Change-Id: I8d568c0f6fbe1c01d97491740aebf299deb63732
2015-04-15 06:36:06 +00:00
Christian Schwede
e16df14a73 Add test for swift_recon.disk_usage
Change-Id: I4cab7aa6df3f0e1933e52ee5dbbb829f30604f10
2015-04-14 16:00:37 -07:00
Christian Schwede
d3213fb1fe Check if device name is valid when adding to the ring
Currently device names can be empty or start and/or end with spaces.
This can create unexpected results, for example these three commands
are all valid:

swift-ring-builder account.builder add "r1z1-127.0.0.1:6000/" 1
swift-ring-builder account.builder add "r1z1-127.0.0.1:6000/sda " 1
swift-ring-builder account.builder add "r1z1-127.0.0.1:6000/ meta" 1

This patch validates device names and prevents empty names or names
starting and/or ending with spaces.
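
A rough sketch of such a check, assuming a hypothetical function name (the
real validation in swift-ring-builder may differ):

```python
def is_valid_device_name(name):
    """Hypothetical check: reject empty names and names that start
    or end with a space."""
    if not name:
        return False
    return not (name.startswith(' ') or name.endswith(' '))


for candidate in ('sda', '', 'sda ', ' sda'):
    print('%r -> %s' % (candidate, is_valid_device_name(candidate)))
```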

Also fixed the test "test_warn_at_risk" - the test passed if the
exception was not raised.

Closes-Bug: 1438579

Change-Id: I811b0eae7db503279e6429d985275bbab8b29c9f
2015-04-14 13:15:15 -07:00
Samuel Merritt
8d3b3b2ee0 Add some debug output to the ring builder
Sometimes, I get handed a builder file in a support ticket and a
question of the form "why is the balance [not] doing $thing?". When
that happens, I add a bunch of print statements to my local
swift/common/ring/builder.py, figure things out, and then delete the
print statements. This time, instead of deleting the print statements,
I turned them into debug() calls and added a "--debug" flag to the
rebalance command in hopes that someone else will find it useful.

Change-Id: I697af90984fa5b314ddf570280b4585ba0ba363c
2015-03-30 17:47:28 -07:00
Jenkins
3973eb38e0 Merge "Add swift-recon feature to track swift-drive-audit error count" 2015-03-24 10:59:24 +00:00
Lorcan
0a46793662 Add swift-recon feature to track swift-drive-audit error count
This is a follow-on from a previous commit which added recon info
for swift-drive-audit (https://review.openstack.org/#/c/122468/).

Here, the "--driveaudit" option is added to the swift-recon tool. This
feature gives the statistics for the system-wide drive errors flagged
by swift-drive-audit. An example of the output is as follows:
(verbose mode)

swift-recon --driveaudit -v
===============================================================================
--> Starting reconnaissance on 5 hosts
===============================================================================
[2015-03-11 17:13:39] Checking drive-audit errors
-> http://1.2.3.4:6000/recon/driveaudit: {'drive_audit_errors': 14}
-> http://1.2.3.5:6000/recon/driveaudit: {'drive_audit_errors': 0}
-> http://1.2.3.6:6000/recon/driveaudit: {'drive_audit_errors': 37}
-> http://1.2.3.7:6000/recon/driveaudit: {'drive_audit_errors': 101}
-> http://1.2.3.8:6000/recon/driveaudit: {'drive_audit_errors': 0}
[drive_audit_errors] low: 0, high: 101, avg: 30.4, total: 152, Failed: 0.0%, no_result: 0, reported: 5
===============================================================================
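
The summary line in the sample output can be reproduced from the five
per-host values. A small sketch using only the numbers shown above (the
variable names are illustrative, not taken from swift-recon):

```python
# Per-host 'drive_audit_errors' values from the sample output above.
errors = [14, 0, 37, 101, 0]

low = min(errors)
high = max(errors)
total = sum(errors)
avg = total / float(len(errors))

print('low: %d, high: %d, avg: %.1f, total: %d, reported: %d'
      % (low, high, avg, total, len(errors)))
```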

Change-Id: Ia16c52a9d613eeb3de1a5a428d88dd1233631912
2015-03-23 11:38:32 +00:00
Alistair Coles
15b83f67d2 Test swift-object-info opens meta and ts files
Adds a unit test to verify the change made in [1], i.e. that
swift-object-info will read from .meta and .ts files as well
as .data files.

[1] change I43966d371218ad39414e9282cde579e48370a2a7

Change-Id: I82dde36e3a96db1a21cfe9a4cca0d941e543dfd0
Related-Bug: 1425679
2015-03-18 12:43:03 +00:00
Mahati Chamarthy
a248a5c09e Ring checker in swift-recon
This patch validates the server endpoints on the ring and generates
a report on any issues found.

Change-Id: I913799a35d5c9178164021cfb7fcb448141b058b
2015-02-26 01:26:02 +05:30
Jenkins
947f979dee Merge "Show each policy's information on quarantined files in recon" 2015-02-13 00:17:53 +00:00
Jenkins
bc7c496f71 Merge "Allow hostnames for nodes in Rings" 2015-02-10 04:32:38 +00:00
Hisashi Osanai
efb39a5665 Allow hostnames for nodes in Rings
This change modifies swift-ring-builder and introduces a new format
for the sub-commands (search, list_parts, set_weight, set_info and remove),
in addition to the add sub-command, so that hostnames can be used in place
of IP addresses for these sub-commands.
The account reaper, container synchronizer, and replicators were also
updated so that they still have a way to identify a particular device
as being "local".

Previously this was Change-Id:
Ie471902413002872fc6755bacd36af3b9c613b74

Change-Id: Ieff583ffb932133e3820744a3f8f9f491686b08d
Co-Authored-By: Alex Pecoraro <alex.pecoraro@emc.com>
Implements: blueprint allow-hostnames-for-nodes-in-rings
2015-02-02 05:06:03 +09:00
sarvesh-ranjan
d8fdbc2b2d Typos fixed
Change-Id: I2c216a870ce299039dec9948dcdef3de0721b4da
2015-01-29 18:20:31 -08:00
Jenkins
df529a225f Merge "Allow set_overload to take value as percent" 2015-01-29 06:19:22 +00:00
Pete Zaitcev
562f7e8906 Allow set_overload to take value as percent
...and output overload as a percent like dispersion and balance.

Also raise a warning if someone tries to set overload higher than 100%
(unless they specifically requested a percent value greater than 100).

Change-Id: Id030123153ea746671a8f1ca306d4b86e903fa22
2015-01-28 13:49:55 -08:00
Clay Gerrard
376dc5adc3 don't print cached dispersion if it's a lie
Change-Id: I551fcaf274876861feb12848749590f220842d68
2015-01-27 10:19:41 -08:00
Daisuke Morita
f8fa1a9234 Show each policy's information on quarantined files in recon
Since the release of Swift 2.0.0, some recon responses still do not
show per-policy information. To make things worse, some recon
results count only policy-0's objects, so the total across all
policies is not shown in the recon results.

This patch makes the count of quarantined files policy-aware for recon
requests. Suppose the number of quarantined objects for policy-0 is 2
and the number for policy-1 is 3; recon sums every policy's count
and shows information for each policy as follows.

$ curl http://<host>:<port>/recon/quarantined
{"accounts": 0, "containers": 0, "objects": 5, "policies": {"0":
{"objects": 2}, "1": {"objects": 3}}}

Moreover, this patch adds stats for each policy in CLI output.
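
As a quick consistency check on the response shown above, the per-policy
counts should add up to the top-level "objects" value. A sketch that parses
that exact JSON body (variable names are illustrative):

```python
import json

# JSON body copied from the example response above.
body = ('{"accounts": 0, "containers": 0, "objects": 5, "policies": '
        '{"0": {"objects": 2}, "1": {"objects": 3}}}')

stats = json.loads(body)
per_policy_total = sum(p['objects'] for p in stats['policies'].values())

print('total objects: %d, sum over policies: %d'
      % (stats['objects'], per_policy_total))
```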

Change-Id: I07217c635f6fc4ea809ddbc3d859c4e81c4fde37
Related-Bug: 1375327
Related-Bug: 1375332
2015-01-20 18:42:20 +09:00
Clay Gerrard
a8bd2f737c Add dispersion command to swift-ring-builder
Output a dispersion report that shows how many parts have each replica count
at each tier along with some additional context.  Also the max_dispersion is a
good canary for what a reasonable overload might be.

Also display a warning on rebalance if the ring's dispersion is sub-optimal.

The primitive form of the dispersion graph is cached on the builder, but the
dispersion command will build it on the fly if you have a ring that was last
rebalanced before the change.

Also add --force option to rebalance to make it write a ring even if less than
1% of parts moved.

Try to clarify some dispersion and balance a little bit in the ring section of
the architectural overview.

Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Darrell Bishop <darrell@swiftstack.com>

Change-Id: I7696df25d092fac56588080722e0a4167ed2c824
2015-01-08 18:40:27 -08:00
Samuel Merritt
bcf26f5209 Add notion of overload to swift-ring-builder
The ring builder's placement algorithm has two goals: first, to ensure
that each partition has its replicas as far apart as possible, and
second, to ensure that partitions are fairly distributed according to
device weight. In many cases, it succeeds in both, but sometimes those
goals conflict. When that happens, operators may want to relax the
rules a little bit in order to reach a compromise solution.

Imagine a cluster of 3 nodes (A, B, C), each with 20 identical disks,
and using 3 replicas. The ring builder will place 1 replica of each
partition on each node, as you'd expect.

Now imagine that one disk fails in node C and is removed from the
ring. The operator would probably be okay with remaining at 1 replica
per node (unless their disks are really close to full), but to
accomplish that, they have to multiply the weights of the other disks
in node C by 20/19 to make C's total weight stay the same. Otherwise,
the ring builder will move partitions around such that some partitions
have replicas only on nodes A and B.

If 14 more disks failed in node C, the operator would probably be okay
with some data not living on C, as a 4x increase in storage
requirements is likely to fill disks.

This commit introduces the notion of "overload": how much extra
partition space can be placed on each disk *over* what the weight
dictates.

For example, an overload of 0.1 means that a device can take up to 10%
more partitions than its weight would imply in order to make the
replica dispersion better.

Overload only has an effect when replica-dispersion and device weights
come into conflict.

The overload is a single floating-point value for the builder
file. Existing builders get an overload of 0.0, so there will be no
behavior change on existing rings.

In the example above, imagine the operator sets an overload of 0.112
on his rings. If node C loses a drive, each other drive can take on up
to 11.2% more data. Splitting the dead drive's partitions among the
remaining 19 results in a 5.26% increase, so everything that was on
node C stays on node C. If another disk dies, then we're up to an
11.1% increase, and so everything still stays on node C. If a third
disk dies, then we've reached the limits of the overload, so some
partitions will begin to reside solely on nodes A and B.
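
The arithmetic in this example can be checked with a short sketch. It is a
simplified model of one node keeping its full share of partitions, not the
ring builder's placement algorithm, and the function name is hypothetical:

```python
def extra_load(total_disks, failed_disks):
    """Fractional increase per surviving disk if the node keeps the
    same total share of partitions after losing some disks."""
    remaining = total_disks - failed_disks
    return total_disks / float(remaining) - 1.0


overload = 0.112
for failed in (1, 2, 3):
    increase = extra_load(20, failed)
    verdict = 'within' if increase <= overload else 'exceeds'
    print('%d failed: +%.2f%% per disk (%s overload)'
          % (failed, increase * 100, verdict))
```

One and two failed disks stay within an overload of 0.112; the third does
not, which matches the point at which partitions start leaving node C.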

DocImpact

Change-Id: I3593a1defcd63b6ed8eae9c1c66b9d3428b33864
2015-01-07 14:16:08 -08:00
Hisashi Osanai
d742b610df Fix the behavior of swift-ring-builder list_parts before rebalance
Running swift-ring-builder list_parts before a rebalance failed abnormally;
this patch fixes that behavior. With this patch applied, the command
completes normally with the following messages.

Specified builder file "<builder_file>" is not rebalanced yet.
Please rebalance first.

Closes-Bug: #1399529
Change-Id: I9e5db6da85de4188915c51bc401604733f0e1b77
2014-12-06 02:44:59 +09:00
Jenkins
b9f08a2be8 Merge "Provides proper error handling on builder unpickle" 2014-10-04 01:31:00 +00:00
Jenkins
513eeb80d7 Merge "updated hacking rules" 2014-10-03 01:47:40 +00:00
Keshava Bharadwaj
0f93fff46a Fixes unit tests to clean up temporary directories
This patch fixes the unit tests to remove the temporary directories
created while the unit tests run. Some unit tests did not correctly tear
down what they had set up for the run. Over time, this would bloat
the tmp directory. As of this writing, around 49 tmp directories were
left behind per round of unit tests. This patch fixes that.

Change-Id: If591375ca9cc87d52c7c9c6dc16c9fb4b49e99fc
2014-09-26 22:39:48 +05:30
Keshava Bharadwaj
38ba5790fb Provides proper error handling on builder unpickle
This patch provides the necessary error handling while unpickling
a builder file. Previously, if a builder file was empty, invalid or
corrupted, the stack trace was shown to the user with an exit code of 1.
This patch shows a user-friendly message instead and returns an exit code
of 2, indicating there was a failure.

Change-Id: I51eb24702c422299629f8053d4591dd10f5863f8
Closes-Bug: #1370680
2014-09-26 09:44:35 +05:30
John Dickinson
e567722c4e updated hacking rules
1) Added comment for H231, which we were already enforcing. H231
is for Python 3.x compatible except statements.

2) Added check for H201, which we were enforcing in reviews
but waiting on hacking checks to be updated. H201 is for bare
except statements, and the update in upstream hacking is to
support the "  # noqa" flag on it.

The H201 check catches some existing bare excepts that are fixed.

Change-Id: I68638aa9ea925ef62f9035a426548c2c804911a8
2014-09-25 11:04:31 -07:00
Christian Schwede
fc5cee5f05 Allow filtering by region in swift-recon
The option "-r" is already in use, so only the long form "--region" is
available to filter by region.

Change-Id: If769f2f3191c202933b03b48fe0f22b7c94a4dd6
Closes-Bug: 1369583
2014-09-15 17:31:16 +00:00
Samuel Merritt
2b55709625 Make swift-form-signature output a sample form
swift-form-signature would give you the required expiration-time and
HMAC signature, but it wouldn't help you actually construct the HTML
form. To do that, you had to go look at the formpost middleware's doc
string and make up a form yourself.

For convenience, this commit makes swift-form-signature output a
sample form with the computed values filled in already; the user only
needs to fill in the Swift cluster's hostname.

Change-Id: I70d70a648b78b382dbfbe8ff918e6158a7f6a0ab
2014-08-05 11:11:03 -07:00
Samuel Merritt
4f2bb9f271 Make swift-form-signature testable
Moved the body of bin/swift-form-signature into
swift/cli/form_signature.py, as was done with swift-ring-builder and
others. Added a couple of basic tests; there's not 100% coverage, but
it's better than the 0% coverage we had before.

It's almost a straight forklift, but I changed exit() calls to return
statements.

Change-Id: Ie2f702c070da24d9cdface83b9e838e9e2965085
2014-07-24 14:38:53 -07:00
Christian Berendt
f6ff06b678 Use except x as y instead of except x, y
According to https://docs.python.org/3/howto/pyporting.html the
syntax changed in Python 3.x. The new syntax is usable with
Python >= 2.6 and should be preferred to be compatible with Python3.

Enabled hacking check H231.
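
For illustration, the preferred form looks like this (the function is a
made-up example, not code from this change):

```python
def parse_port(value):
    """Convert a string to a port number, or return an error message."""
    try:
        return int(value)
    # "except ValueError as err" works on Python 2.6+ and 3.x;
    # the older "except ValueError, err" form is a syntax error on 3.x.
    except ValueError as err:
        return 'invalid port: %s' % err


print(parse_port('6000'))
print(parse_port('not-a-port'))
```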

Change-Id: I2c41dc3ec83e79181e8fd50e76771a74c393269c
2014-07-07 15:42:13 -07:00
Christian Schwede
e21703ff7b Add test for swift-recon --auditor
Related-Bug: 1329785
Change-Id: I47cecd8a4cd55ca75c2a51153be7bb61c27d0ea0
2014-06-25 14:07:52 +00:00
Jenkins
570e50fe22 Merge "Change assertCalledWith to assert_called_with" 2014-06-24 05:04:06 +00:00
Clay Gerrard
c1dc2fa624 Add two vector timestamps
The normalized form of the X-Timestamp header looks like a float with a fixed
width to ensure stable string sorting - normalized timestamps look like
"1402464677.04188"

To support overwrites of existing data without modifying the original
timestamp, while still maintaining consistency, a second internal offset
vector is appended to the normalized timestamp form. The result compares
and sorts greater than the fixed-width float format but less than a newer
timestamp.  The internalized format of timestamps looks like
"1402464677.04188_0000000000000000" - the portion after the underscore
is the offset and is a formatted hexadecimal integer.

The internalized form is not exposed to clients in responses from Swift.
Normal client operations will not create a timestamp with an offset.

The Timestamp class in common.utils supports internalized and normalized
formatting of timestamps and also comparison of timestamp values.  When the
offset value of a Timestamp is 0 - it's considered insignificant and need not
be represented in the string format; to support backwards compatibility during
a Swift upgrade the internalized and normalized form of a Timestamp with an
insignificant offset are identical.  When a timestamp includes an offset it
will always be represented in the internalized form, but is still excluded
from the normalized form.  Timestamps with an equivalent timestamp portion
(the float part) will compare and order by their offset.  Timestamps with a
greater timestamp portion will always compare and order greater than a
Timestamp with a lesser timestamp regardless of its offset.  String
comparison and ordering is guaranteed for the internalized string format, and
is backwards compatible for normalized timestamps which do not include an
offset.
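
The two string forms and their ordering can be sketched as follows. This is
an illustration of the format described above with hypothetical function
names; it is not the Timestamp class in common.utils:

```python
def normalized(timestamp):
    """Fixed-width float form, e.g. '1402464677.04188'."""
    return '%016.5f' % timestamp


def internalized(timestamp, offset=0):
    """Normalized form plus a 16-digit hex offset when it is non-zero.

    An offset of 0 is insignificant, so both forms are identical.
    """
    base = normalized(timestamp)
    if offset == 0:
        return base
    return '%s_%016x' % (base, offset)


original = internalized(1402464677.04188)
bumped = internalized(1402464677.04188, offset=1)
newer = internalized(1402464677.04189)

# Plain string sorting keeps original < bumped < newer.
print(sorted([newer, bumped, original]))
```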

The reconciler currently uses an offset bump to ensure that objects can move to
the wrong storage policy and be moved back.  This use-case is valid because
the content represented by the user-facing timestamp is not modified in any way.
Future consumers of the offset vector of timestamps should be mindful of HTTP
semantics of If-Modified and take care to avoid deviation in the response from
the object server without an accompanying change to the user facing timestamp.

DocImpact
Implements: blueprint storage-policies
Change-Id: Id85c960b126ec919a481dc62469bf172b7fb8549
2014-06-19 10:18:06 -07:00
Paul Luse
8326dc9f2a Add Storage Policy Support to Recon Middleware
Recon middleware returns object ring file MD5 sums; this patch
updates it to include other object files that may be present
because of Storage Policies.  Also adds unit test coverage for
the MD5 reporting function which previously had none.

The recon script will now check that all rings the server responds with
match the on-disk MD5s, regardless of server type, including any
storage policy object rings.

Note the small change to the ring save method, needed to
stimulate the right code paths in 2.6 and 2.7 versions of
gzip to enable testing of ring MD5 sums.

DocImpact
Implements: blueprint storage-policies
Change-Id: I01efd2999d6d9c57ee8693ac3a6236ace17c5566
2014-06-18 21:09:54 -07:00
Yuan Zhou
6cc10d17de Update bin scripts to be storage policy aware
swift-container-info:
    Print policy container info

swift-object-info:
    Allow specifying a storage policy name when looking for object info
    Notify if there is a mismatch between the ring location and the actual
    object path in the filesystem

swift-get-nodes:
    Allow specifying a storage policy name when looking for account/
    container/object ring location
    Notify if there is a mismatch between the ring and the policy

Lookup policy name in swift.conf; 'Legacy' container will use
policy-0's name; 'Unknown' is shown if policy not found in swift.conf

DocImpact
Implements: blueprint storage-policies
Change-Id: I450d40dc6e2d8f759187dff36d658e52737ae2a5
2014-06-18 20:57:09 -07:00
Clay Gerrard
7624b198cf Update FakeRing and FakeLogger
FakeLogger gets better log level handling

Parameterize logger on some daemons which were previously
unparameterized and try and use the interface in tests.

FakeRing use more real code

The existing FakeRing mock's implementation bit me on some pretty subtle
character encoding issue by-passing the hash_path code that is normally
part of get_part_nodes.  This change tries to exercise more of the real
ring code paths when it makes sense and provide a better Fake for use in
testing.

Add write_fake_ring helper to test.unit for when you need a real ring.

DocImpact
Implements: blueprint storage-policies
Change-Id: Id2e3740b1dd569050f4e083617e7dd6a4249027e
2014-06-18 17:31:37 -07:00
Paul Luse
d32169c9c8 Change assertCalledWith to assert_called_with
In test_ptime(), two uses of assertCalledWith did not fail even though
bogus values were passed to the assert.  Using assert_called_with() instead
correctly performs the assertion.
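
A small sketch of the difference, assuming the standard-library
unittest.mock module (the tests at the time may have used the separate
mock package). On older mock versions, a misspelled name such as
assertCalledWith could be auto-created as an ordinary mock attribute, so
calling it checked nothing:

```python
from unittest import mock

callback = mock.Mock()
callback('expected')

# Passes: the arguments match the most recent call.
callback.assert_called_with('expected')

# A real assertion method raises when the arguments are bogus.
try:
    callback.assert_called_with('bogus')
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

print('mismatch detected: %s' % mismatch_detected)
```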

Change-Id: Idbd918f8e8d3ec5a4110725b949710fb54b4ba9a
2014-06-10 08:08:13 -07:00
Jenkins
fc7f6dd924 Merge "Check swift.conf MD5 with recon" 2014-04-23 03:23:47 +00:00
Paul Luse
856c15539a Fix testcase test_print_db_info_metadata()
Test compares cluster info to hardcoded expected data and wasn't
sorting the two sets of things being compared leading to some
sporadic unit test failures.

Change-Id: I3ef98260a62c15d06ba8cc196196d4e90abca3f0
2014-04-14 16:14:30 -07:00
Madhuri Kumari
67fff5b297 Print 'Container Count' in data base info
'Container Count' was missing from the database info output.
This patch prints 'Container Count' as well.

Change-Id: I1ca80ee79e71b086b30fd2d1ab024ea1cfb324f5
2014-04-12 09:20:06 +05:30
Samuel Merritt
31dac18625 Check swift.conf MD5 with recon
I've seen several folks recently have problems with their Swift
clusters because they had different hash prefixes on different
nodes. Let's help them out by having recon check that.

Note that MD5-equality is stronger than what we need (which is
ConfigParser-equality for a particular set of keys), but this way we
don't expose the secret hash prefix and suffix across the internal
network, just the MD5 checksum of the file containing them.
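
Computing such a checksum takes only a few lines. A sketch with a
hypothetical helper name (the recon middleware's own implementation may
differ):

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel stops at the empty read at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()
```

Comparing the digest reported by each node against the local one, for
example file_md5('/etc/swift/swift.conf') where that path is an assumed
location, shows whether the files differ without revealing their contents.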

Change-Id: I3af984ee45947345891b3c596a88e3464f178cc7
2014-04-10 14:08:27 -07:00
Peter Portante
de020f0189 Handle getting info on wrong database type
Change-Id: I32f66f6a7683180a18a2807143d0910c75bf16f0
2014-04-03 15:56:25 -04:00
Yuan Zhou
39f5eab890 Clean up swift-{account, container}-info
Reuse common code; add unit tests; ensured coverage was at 100%.

Change-Id: Id6fcc7cb07fd178e00d43968e3e2cc03226fdc05
2014-04-03 09:54:58 -04:00
Christian Schwede
24657b2b39 Add tests for swift-ring-builder
Add some tests for essential methods in swift-ring-builder.
Tests for removing or changing device settings are executed
with different search values to cover many possible command
line arguments.

Currently tested methods:

- create ring
- add device
- remove device
- set weight
- set info
- set min_part_hours
- set replicas

Tests use swift.common.ring.RingBuilder to verify actions.

Output from print statements is not captured or tested, because
this requires redirecting sys.stdout during tests, and that might
have side effects on testing tools.

bin/swift-ring-builder has been moved to swift/cli/ringbuilder.py
and slightly modified to work as before (mainly because the former
global variables no longer exist now that this part of the code has
been moved inside a main() function).

Change-Id: Ia63f59a8faca1fad990784f27532ca07a2125454
2014-02-05 16:20:09 +00:00
Christian Schwede
cd4b4da8b6 Add some tests for bin/swift-recon
Also fix a minor bug in zone filtering when zone is set to 0.

Moved bin/swift-recon to swift/cli/recon.py, which makes
it possible to import it without using some scary hacks.
bin/swift-recon is now created by setup.py install.

Closes-Bug: #1261692
Change-Id: Id0729991c8ece73604467480dbf93fec7d8eb196
2014-01-31 15:34:37 +00:00