When looking at containers and accounts it's sometimes nice to know who
they've been replicating with. This patch adds a `--sync|-s` option to
swift-{container|account}-info which will also dump the incoming and
outgoing sync tables:
    $ swift-container-info /srv/node3/sdb3/containers/294/624/49b9ff074c502ec5e429e7af99a30624/49b9ff074c502ec5e429e7af99a30624.db -s
    Path: /AUTH_test/new
      Account: AUTH_test
      Container: new
      Deleted: False
      Container Hash: 49b9ff074c502ec5e429e7af99a30624
    Metadata:
      Created at: 2022-02-16T05:34:05.988480 (1644989645.98848)
      Put Timestamp: 2022-02-16T05:34:05.981320 (1644989645.98132)
      Delete Timestamp: 1970-01-01T00:00:00.000000 (0)
      Status Timestamp: 2022-02-16T05:34:05.981320 (1644989645.98132)
      Object Count: 1
      Bytes Used: 7
      Storage Policy: default (0)
      Reported Put Timestamp: 1970-01-01T00:00:00.000000 (0)
      Reported Delete Timestamp: 1970-01-01T00:00:00.000000 (0)
      Reported Object Count: 0
      Reported Bytes Used: 0
      Chexor: 962368324c2ca023c56669d03ed92807
      UUID: f33184e7-56d5-4c74-9d2e-5417c187d722-sdb3
      X-Container-Sync-Point2: -1
      X-Container-Sync-Point1: -1
    No system metadata found in db file
    No user metadata found in db file
    Sharding Metadata:
      Type: root
      State: unsharded
    Incoming Syncs:
      Sync Point  Remote ID                                  Updated At
      1           ce7268a1-f5d0-4b83-b993-af17b602a0ff-sdb1  2022-02-16T05:38:22.000000 (1644989902)
      1           2af5abc0-7f70-4e2f-8f94-737aeaada7f4-sdb4  2022-02-16T05:38:22.000000 (1644989902)
    Outgoing Syncs:
      Sync Point  Remote ID                                  Updated At
    Partition  294
    Hash       49b9ff074c502ec5e429e7af99a30624
As a follow-up to the patch that added the device to the DB ID, we can
see that the replicas at sdb1 and sdb4 have replicated with this node.
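
A rough sketch of what dumping a sync table could look like. The schema
below is a simplified stand-in and an assumption for illustration; the
helper name dump_syncs() is hypothetical and is not the code in this patch.

```python
import sqlite3

# Assumed, simplified schema for the two sync tables of a container DB.
SCHEMA = """
CREATE TABLE incoming_sync (
    remote_id TEXT UNIQUE, sync_point INTEGER, updated_at TEXT DEFAULT 0);
CREATE TABLE outgoing_sync (
    remote_id TEXT UNIQUE, sync_point INTEGER, updated_at TEXT DEFAULT 0);
"""


def dump_syncs(conn, incoming=True):
    """Return (sync_point, remote_id, updated_at) rows for one sync table."""
    table = 'incoming_sync' if incoming else 'outgoing_sync'
    query = ('SELECT sync_point, remote_id, updated_at FROM %s '
             'ORDER BY remote_id' % table)
    return conn.execute(query).fetchall()


# Build an in-memory DB holding the two peers from the example output.
conn = sqlite3.connect(':memory:')
conn.executescript(SCHEMA)
conn.executemany(
    'INSERT INTO incoming_sync (remote_id, sync_point, updated_at) '
    'VALUES (?, ?, ?)',
    [('ce7268a1-f5d0-4b83-b993-af17b602a0ff-sdb1', 1, '1644989902'),
     ('2af5abc0-7f70-4e2f-8f94-737aeaada7f4-sdb4', 1, '1644989902')])

for sync_point, remote_id, updated_at in dump_syncs(conn):
    print('%-10s  %-41s  %s' % (sync_point, remote_id, updated_at))
```

The device suffix on each remote ID (sdb1, sdb4) is what makes the table
useful for working out which peers have replicated with this DB.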
Change-Id: I23d786e82c6710bea7660a9acf8bbbd113b5b727
Currently we make no effort to format the meta and sysmeta stored in a
db when we use the info tools (swift-{container,account}-info). This
patch formats them properly so they are easier for humans to read.
Change-Id: I5c3d260d677c61213c42662c6641207d9b7f026a
Replicated, unencrypted metadata is written down differently on py2
vs py3, and has been since we started supporting py3. Fortunately,
we can inspect the raw xattr bytes to determine whether the pickle
was written using py2 or py3, so we can properly read legacy py2 meta
under py3 rather than hitting a unicode error.
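
A minimal sketch of the idea, not the code in this patch. It assumes that
a protocol-2 pickle written by py3 serializes bytes through a reference
to _codecs.encode, and uses that as the marker; the helper names and the
hand-built py2 blob are illustrative only.

```python
import pickle

# Hand-built protocol-2 pickle of {'name': '\xc3\xa9'} using the short
# binary string opcode, which is how py2 would serialize native strings.
PY2_BLOB = b"\x80\x02}q\x01U\x04nameq\x02U\x02\xc3\xa9q\x03s."

# The same metadata written by py3 with bytes keys and values.
PY3_BLOB = pickle.dumps({b'name': b'\xc3\xa9'}, protocol=2)


def written_by_py3(raw):
    # Assumption: py3 pickles non-empty bytes at protocol 2 via
    # _codecs.encode, so that global reference appears in the raw bytes.
    return b'_codecs\nencode' in raw


def read_metadata(raw):
    """Unpickle metadata to bytes regardless of which Python wrote it."""
    if written_by_py3(raw):
        return pickle.loads(raw)
    # Legacy py2 pickle: ask for bytes instead of ASCII-decoded str so
    # non-ASCII values do not raise a UnicodeDecodeError.
    return pickle.loads(raw, encoding='bytes')


print(read_metadata(PY2_BLOB) == read_metadata(PY3_BLOB))
```

Loading PY2_BLOB with the default encoding is what would raise the
unicode error described above.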
Closes-Bug: #2012531
Change-Id: I5876e3b88f0bb1224299b57541788f590f64ddd4
--path-as-is is not supported by older versions of curl (prior to
7.42.0), so the curl commands printed by swift-get-nodes must be cut,
pasted *and then edited* before they can be run with older
curl. Moving the option to the end of the command line makes it
possible to just cut (omitting the last option) and paste the
commands.
Note: the --path-as-is option is not dropped altogether based on
testing the curl version available because the host on which
swift-get-nodes is executed might have a different curl version to the
host on which the curl command is to be executed. The presence of the
option also serves as a reminder that it (and newer curl) might be
needed with some paths.
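
To illustrate the ordering, here is a small sketch of a command builder.
The function name, the other curl flags and the URL are assumptions for
the example and not necessarily what swift-get-nodes prints.

```python
import shlex


def curl_command(url, headers=(), path_as_is=True):
    """Build a printable curl command with --path-as-is as the last word."""
    parts = ['curl', '-g', '-I', '-XHEAD']
    for header in headers:
        parts += ['-H', header]
    parts.append(url)
    if path_as_is:
        # Kept last so users of older curl can drop it by omitting the
        # final word when copying the command.
        parts.append('--path-as-is')
    return ' '.join(shlex.quote(part) for part in parts)


# Hypothetical node address and object path containing "/./".
cmd = curl_command('http://127.0.0.1:6010/sdb1/294/AUTH_test/new/a/./b')
print(cmd)
# curl -g -I -XHEAD http://127.0.0.1:6010/sdb1/294/AUTH_test/new/a/./b --path-as-is
```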
Change-Id: Ifaf3cf97e6410d4b8818042a5082177418d6c6a2
The current behavior is really painful when you've got hundreds of shard
ranges in a DB. A new summary of shard ranges by state is now the
default. Users can add a -v/--verbose flag to see the old full detail view.
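
A sketch of the summary-versus-verbose idea, assuming shard ranges are
plain dicts with "name" and "state" keys; the function name and the
output format are hypothetical.

```python
from collections import Counter


def summarize_shard_ranges(shard_ranges, verbose=False):
    """Return output lines: one per range if verbose, else one per state."""
    if verbose:
        return ['%s: %s' % (sr['name'], sr['state']) for sr in shard_ranges]
    counts = Counter(sr['state'] for sr in shard_ranges)
    return ['%s: %d' % (state, count)
            for state, count in sorted(counts.items())]


# Illustrative data only.
ranges = [
    {'name': '.shards_a/c-0', 'state': 'active'},
    {'name': '.shards_a/c-1', 'state': 'active'},
    {'name': '.shards_a/c-2', 'state': 'created'},
]
print('\n'.join(summarize_shard_ranges(ranges)))
```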
Change-Id: I0a7d65f64540f99514c52a70f9157ef060a8a892
If an object name has something like /./ in it then curl will
resolve it before sending the request. Use curl's --path-as-is option
so the path is sent unmodified.
Change-Id: I4e45cb62d41f6aada4fdbb00d86b4bd737b441ee
Closes-Bug: #1885244
Now that we can have null bytes in Swift paths, we need a way for
operators to be able to locate such containers and objects. Our usual
trick of making sure the name is properly quoted for the shell won't
suffice; running something like
swift-get-nodes /etc/swift/container.ring.gz $'AUTH_test/\0versions\0container'
has the path get cut off after "AUTH_test/" because of how argv works.
So, add a new option, --quoted, to let operators indicate that they
already quoted the path.
Drive-bys:
* If account, container, or object are explicitly blank, treat them
as though they were not provided. This provides better errors when
account is explicitly blank, for example.
* If account, container, or object are not provided or explicitly
blank, skip printing them. This resolves ambiguities about things
like objects whose name is actually "None".
* When displaying account, container, and object, quote them (since
they may contain newlines or other control characters).
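
The behavior described above could be sketched roughly as follows. The
function names are hypothetical and this is only an illustration of the
quoting and blank handling, not the code in this patch.

```python
from urllib.parse import quote, unquote


def parse_path(path, quoted=False):
    """Split "account/container/object"; blank parts become None."""
    parts = path.split('/', 2)
    parts += [''] * (3 - len(parts))
    if quoted:
        # The caller percent-encoded each part, so decode after splitting.
        parts = [unquote(part) for part in parts]
    # Explicitly blank values are treated as though they were not provided.
    return tuple(part or None for part in parts)


def display(name):
    """Quote a name for printing; it may hold control characters."""
    return quote(name)


account, container, obj = parse_path(
    'AUTH_test/%00versions%00container', quoted=True)
print(display(container))
# %00versions%00container
```

Passing the null bytes percent-encoded avoids the argv truncation, since
the shell never sees a raw null byte.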
Change-Id: I3d10e121b403de7533cc3671604bcbdecb02c795
Related-Change: If912f71d8b0d03369680374e8233da85d8d38f85
Closes-Bug: #1875734
Closes-Bug: #1875735
Closes-Bug: #1875736
Related-Bug: #1791302
This started with ShardRanges and its CLI. The sharder is at the
bottom of the dependency chain. Even container backend needs it.
Once we started tinkering with the sharder, it all snowballed to
include the rest of the container services.
Beware, this does affect some of the Python 2 code. Mostly it's trivial
and obviously correct, but it needs checking by reviewers.
About killing the stray "from __future__ import unicode_literals":
we do not do it in general. The specific problem it caused was
a failure of functional tests because unicode leaked into a field
that was supposed to be encoded. It is just too hard to track the
types when rules change from file to file, so off with its head.
Change-Id: Iba4e65d0e46d8c1f5a91feb96c2c07f99ca7c666
With this patch the ContainerBroker gains several new features:
1. A shard_ranges table to persist ShardRange data, along with
methods to merge ShardRange instances into that table, to access
them, and to remove expired shard ranges.
2. The ability to create a fresh db file to replace the existing db
file. Fresh db files are named using the hash of the container path
plus an epoch which is a serialized Timestamp value, in the form:
<hash>_<epoch>.db
During sharding both the fresh and retiring db files co-exist on
disk. The ContainerBroker is now able to choose the newest on disk db
file when instantiated. It also provides a method (get_brokers()) to
gain access to a broker instance for either on disk file.
3. Methods to access the current state of the on disk db files i.e.
UNSHARDED (old file only), SHARDING (fresh and retiring files), or
SHARDED (fresh file only with shard ranges).
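
A simplified sketch of the file naming and state rules listed above. The
helper names are hypothetical, and the sketch looks only at file names;
it does not check for shard ranges as the real SHARDED state requires.

```python
import os

UNSHARDED, SHARDING, SHARDED = 'unsharded', 'sharding', 'sharded'


def parse_db_filename(filename):
    """Return (hash, epoch); epoch is None for an old-style <hash>.db."""
    name, _ext = os.path.splitext(os.path.basename(filename))
    hash_, _sep, epoch = name.partition('_')
    return hash_, epoch or None


def newest_db_file(filenames):
    """Pick the file with the greatest epoch; files with no epoch sort first."""
    return max(filenames, key=lambda f: parse_db_filename(f)[1] or '')


def db_state(filenames):
    """Derive the state from which kinds of db file exist on disk."""
    epochs = [parse_db_filename(f)[1] for f in filenames]
    has_retiring = None in epochs
    has_fresh = any(epoch is not None for epoch in epochs)
    if has_retiring and has_fresh:
        return SHARDING
    if has_fresh:
        return SHARDED
    return UNSHARDED


old = '49b9ff074c502ec5e429e7af99a30624.db'
fresh = '49b9ff074c502ec5e429e7af99a30624_1644989645.98132.db'
print(db_state([old]), db_state([old, fresh]), db_state([fresh]))
# unsharded sharding sharded
```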
Container replication is also modified:
1. shard ranges are replicated between container db peers. Unlike
objects, shard ranges are both pushed and pulled during a REPLICATE
event.
2. If a container db is capable of being sharded (i.e. it has a set of
shard ranges) then it will no longer attempt to replicate objects to
its peers. Object record durability is achieved by sharding rather than
peer to peer replication.
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Ie4d2816259e6c25c346976e181fb9d350f947190
Bring under test
- test/unit/cli/test_dispersion_report.py
- test/unit/cli/test_info.py and
- test/unit/cli/test_relinker.py
I've verified that swift-*-info (at least) behave reasonably under
py3, even swift-object-info when there's non-utf8 metadata on the
data/meta file.
Change-Id: Ifed4b8059337c395e56f5e9f8d939c34fe4ff8dd
Currently, our integrity checking for objects is pretty weak when it
comes to object metadata. If the extended attributes on a .data or
.meta file get corrupted in such a way that we can still unpickle it,
we don't have anything that detects that.
This could be especially bad with encrypted etags; if the encrypted
etag (X-Object-Sysmeta-Crypto-Etag or whatever it is) gets some bits
flipped, then we'll cheerfully decrypt the cipherjunk into plainjunk,
then send it to the client. Net effect is that the client sees a GET
response with an ETag that doesn't match the MD5 of the object *and*
Swift has no way of detecting and quarantining this object.
Note that, with an unencrypted object, if the ETag metadatum gets
mangled, then the object will be quarantined by the object server or
auditor, whichever notices first.
As part of this commit, I also ripped out some mocking of
getxattr/setxattr in tests. It appears to be there to allow unit tests
to run on systems where /tmp doesn't support xattrs. However, since
the mock is keyed off of inode number and inode numbers get re-used,
there's lots of leakage between different test runs. On a real FS,
unlinking a file and then creating a new one of the same name will
also reset the xattrs; this isn't the case with the mock.
The mock was pretty old; Ubuntu 12.04 and up all support xattrs in
/tmp, and recent Red Hat / CentOS releases do too. The xattr mock was
added in 2011; maybe it was to support Ubuntu Lucid Lynx?
Bonus: now you can pause a test with the debugger, inspect its files
in /tmp, and actually see the xattrs along with the data.
Since this patch now uses a real filesystem for testing filesystem
operations, tests are skipped if the underlying filesystem does not
support setting xattrs (eg tmpfs or more than 4k of xattrs on ext4).
References to "/tmp" have been replaced with calls to
tempfile.gettempdir(). This will allow setting the TMPDIR envvar in
test setup and getting an XFS filesystem instead of ext4 or tmpfs.
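
A rough sketch of how a test could probe for xattr support before
deciding to skip. It uses os.setxattr from the standard library (Linux
only) as an assumption; the helper name and the xattr key are
hypothetical and this is not the helper added by this patch.

```python
import os
import tempfile


def xattrs_supported(size=4096, tmpdir=None):
    """Return True if `size` bytes of xattr can be set on a temp file."""
    if not hasattr(os, 'setxattr'):
        # Platform without the xattr syscalls exposed by the os module.
        return False
    with tempfile.NamedTemporaryFile(
            dir=tmpdir or tempfile.gettempdir()) as fp:
        try:
            os.setxattr(fp.name, 'user.swift.testing_key', b'x' * size)
        except OSError:
            # For example: unsupported on this filesystem, or the value
            # is larger than the filesystem allows.
            return False
    return True


print(xattrs_supported())
```

A test could then raise unittest.SkipTest when this returns False. Since
the probe uses tempfile.gettempdir(), setting TMPDIR changes which
filesystem is checked.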
THIS PATCH SIGNIFICANTLY CHANGES TESTING ENVIRONMENTS
With this patch, every test environment will require TMPDIR to be
using a filesystem that supports at least 4k of extended attributes.
Neither ext4 nor tmpfs support this. XFS is recommended.
So why all the SkipTests? Why not simply raise an error? We still need
the tests to run on the base image for OpenStack's CI system. Since
we were previously mocking out xattr, there wasn't a problem, but we
also weren't actually testing anything. This patch adds functionality
to validate xattr data, so we need to drop the mock.
`test.unit.skip_if_no_xattrs()` is also imported into `test.functional`
so that functional tests can import it from the functional test
namespace.
The related OpenStack CI infrastructure changes are made in
https://review.openstack.org/#/c/394600/.
Co-Authored-By: John Dickinson <me@not.mn>
Change-Id: I98a37c0d451f4960b7a12f648e4405c6c6716808
Add a --drop-prefixes flag to swift-account-info, swift-container-info,
and swift-object-info. This makes the output between the three more
consistent.
Change-Id: I98252ff74c4983eaad0a93d9a9fc527c74ffce68
- Verify that the .ring.gz path exists if a ring file is the first argument.
- Code Refactoring:
- swift/cli/info.parse_get_node_args()
- Respective test cases for info.parse_get_node_args()
Closes-Bug: #1539275
Change-Id: I0a41936d6b75c60336be76f8702fd616d74f1545
Signed-off-by: Sachin Patil <psachin@redhat.com>
I replaced generic asserts with more specific assert methods,
e.g.: from assertTrue(sth == None) to assertIsNone(*) or
assertTrue(isinstance(inst, type)) to assertIsInstance(inst, type)
or assertTrue(not sth) to assertFalse(sth).
The code becomes more readable, and a better description is shown on failure.
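
For illustration, a tiny self-contained test case using the more
specific methods (the class and variable names are made up for the
example):

```python
import unittest


class TestSpecificAsserts(unittest.TestCase):
    def test_examples(self):
        value = None
        items = []
        # Instead of assertTrue(value == None)
        self.assertIsNone(value)
        # Instead of assertTrue(isinstance(items, list))
        self.assertIsInstance(items, list)
        # Instead of assertTrue(not items)
        self.assertFalse(items)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSpecificAsserts)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

On failure, assertIsNone reports the offending value, whereas
assertTrue(sth == None) only reports that False is not true.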
Change-Id: I39305808ad2349dc11a42261b41dbb347ac0618a
Add unit tests to cover all code paths in print_item_locations
function in cli/info.py.
Update comment to match what's tested for invalid/missing policy.
Update tests to verify output of print_item_locations
Corrected PEP8 compliance violations.
Change-Id: I84958cb70205ee8d7ea246826dd56201fa642da9
I like using the rightmost one more; it's basically
/operator-defined/mountpoint/objects/part/suffix/hash/ts.data, so I
don't see any opportunity for other things named "objects" to creep in on the
right of the real objects-N dir; but I could see some admin using
/srv/object-storage/ or something
-- Torgomatic The Wise
Change-Id: I0a63a3e02df091a5ee2e110a345183012e357a2f
If the swift-object-info command is executed from a working directory
deeper than the 'objects-*' directory, it cannot parse the policy index
from the file path, so it does not show the appropriate policy index.
This patch fixes the problem by resolving the full path of the target
object file.
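
A sketch of the idea, assuming the policy index is encoded in an
"objects" or "objects-N" path component. The function name is
hypothetical, and scanning from the right is a design choice for the
example rather than a claim about the actual implementation.

```python
import os


def policy_index_from_path(datafile):
    """Return the policy index from the rightmost objects[-N] directory."""
    # Resolve relative paths first so the objects dir is always visible,
    # even when the command runs from below it.
    parts = os.path.abspath(datafile).split(os.sep)
    for part in reversed(parts):
        if part == 'objects':
            return 0
        if part.startswith('objects-'):
            suffix = part[len('objects-'):]
            if suffix.isdigit():
                return int(suffix)
    return None


# Hypothetical on-disk path.
path = '/srv/node/sdb1/objects-2/294/624/49b9ff074c502ec5/1644989645.98132.data'
print(policy_index_from_path(path))
# 2
```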
Change-Id: Idb734106a44b6121119c9b1dc8cdaaf4c6c28c31
Closes-Bug: 1469951
* replace "from cStringIO import StringIO"
with "from six.moves import cStringIO as StringIO"
* replace "from StringIO import StringIO"
with "from six import StringIO"
* replace "import cStringIO" and "cStringIO.StringIO()"
with "from six import moves" and "moves.cStringIO()"
* replace "import StringIO" and "StringIO.StringIO()"
with "import six" and "six.StringIO()"
This patch was generated by the stringio operation of the sixer tool:
https://pypi.python.org/pypi/sixer
Change-Id: Iacba77fec3045f96773d1090c0bd48613729a561
Split out system, user and other metadata in swift-object-info. Print
every item on its own line instead of as a raw dict representation, so
the output is easier to parse with tools such as grep.
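
A minimal sketch of the grouping and line-by-line output. The function
names and the exact output layout are assumptions for illustration; only
the X-Object-Sysmeta- and X-Object-Meta- prefixes are taken as given.

```python
def split_metadata(metadata):
    """Group object metadata into system, user and other items."""
    groups = {'System Metadata': {}, 'User Metadata': {},
              'Other Metadata': {}}
    for key, value in metadata.items():
        lower = key.lower()
        if lower.startswith('x-object-sysmeta-'):
            groups['System Metadata'][key] = value
        elif lower.startswith('x-object-meta-'):
            groups['User Metadata'][key] = value
        else:
            groups['Other Metadata'][key] = value
    return groups


def format_metadata(metadata):
    """Return one line per item so the output is grep-friendly."""
    lines = []
    for title, items in split_metadata(metadata).items():
        if not items:
            lines.append('No %s found' % title.lower())
            continue
        lines.append('%s:' % title)
        for key in sorted(items):
            lines.append('  %s: %s' % (key, items[key]))
    return lines


# Illustrative metadata only.
example = {
    'X-Object-Meta-Color': 'blue',
    'Content-Type': 'text/plain',
}
print('\n'.join(format_metadata(example)))
```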
Co-Authored-By: Ricardo Ferreira <ricardo.sff@gmail.com>
Co-Authored-By: Kamil Rykowski <kamil.rykowski@intel.com>
Change-Id: Ia78da518c18f7e26016700aee87efb534fbd2040
Closes-Bug: #1428866
Adds a unit test to verify the change made in [1], i.e. that
swift-object-info will read from .meta and .ts files as well
as .data files.
[1] change I43966d371218ad39414e9282cde579e48370a2a7
Change-Id: I82dde36e3a96db1a21cfe9a4cca0d941e543dfd0
Related-Bug: 1425679
According to https://docs.python.org/3/howto/pyporting.html the
syntax for catching exceptions changed in Python 3.x ("except x, y:"
is no longer valid). The new "except x as y:" syntax is usable with
Python >= 2.6 and should be preferred for compatibility with Python 3.
Enabled hacking check H231.
Change-Id: I2c41dc3ec83e79181e8fd50e76771a74c393269c
The normalized form of the X-Timestamp header looks like a float with a fixed
width to ensure stable string sorting - normalized timestamps look like
"1402464677.04188"
To support overwrites of existing data without modifying the original
timestamp but still maintain consistency, a second internal offset
vector is appended to the normalized timestamp form which compares and
sorts greater than the fixed width float format but less than a newer
timestamp. The internalized format of timestamps looks like
"1402464677.04188_0000000000000000" - the portion after the underscore
is the offset and is a formatted hexadecimal integer.
The internalized form is not exposed to clients in responses from Swift.
Normal client operations will not create a timestamp with an offset.
The Timestamp class in common.utils supports internalized and normalized
formatting of timestamps and also comparison of timestamp values. When the
offset value of a Timestamp is 0 - it's considered insignificant and need not
be represented in the string format; to support backwards compatibility during
a Swift upgrade the internalized and normalized form of a Timestamp with an
insignificant offset are identical. When a timestamp includes an offset it
will always be represented in the internalized form, but is still excluded
from the normalized form. Timestamps with an equivalent timestamp portion
(the float part) will compare and order by their offset. Timestamps with a
greater timestamp portion will always compare and order greater than a
Timestamp with a lesser timestamp regardless of its offset. String
comparison and ordering is guaranteed for the internalized string format, and
is backwards compatible for normalized timestamps which do not include an
offset.
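
The formatting and ordering rules above can be sketched with a
simplified stand-in class. SimpleTimestamp is a hypothetical name used
only for this example; it is not the Timestamp class in common.utils.

```python
import functools


@functools.total_ordering
class SimpleTimestamp(object):
    def __init__(self, timestamp, offset=0):
        self.timestamp = float(timestamp)
        self.offset = int(offset)

    @property
    def normal(self):
        # Fixed-width float so that string sorting is stable.
        return '%016.05f' % self.timestamp

    @property
    def internal(self):
        # A zero offset is insignificant: internal and normal are equal.
        if self.offset > 0:
            return '%s_%016x' % (self.normal, self.offset)
        return self.normal

    def __eq__(self, other):
        return self.internal == other.internal

    def __lt__(self, other):
        # Ordering follows the internalized string form.
        return self.internal < other.internal

    def __hash__(self):
        return hash(self.internal)


t0 = SimpleTimestamp('1402464677.04188')
t1 = SimpleTimestamp('1402464677.04188', offset=1)
t2 = SimpleTimestamp('1402464677.04189')
print(t1.normal, t1.internal)
# 1402464677.04188 1402464677.04188_0000000000000001
```

Here t0 < t1 < t2: the offset orders timestamps with an equal float
part, but never lifts one above a timestamp with a greater float part.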
The reconciler currently uses an offset bump to ensure that objects can move to
the wrong storage policy and be moved back. This use-case is valid because
the content represented by the user-facing timestamp is not modified in any way.
Future consumers of the offset vector of timestamps should be mindful of HTTP
semantics of If-Modified and take care to avoid deviation in the response from
the object server without an accompanying change to the user facing timestamp.
DocImpact
Implements: blueprint storage-policies
Change-Id: Id85c960b126ec919a481dc62469bf172b7fb8549
swift-container-info:
Print policy container info
swift-object-info:
Allow specifying a storage policy name when looking for object info
Notify if there is a mismatch between the ring location and the actual
object path in the filesystem
swift-get-nodes:
Allow specifying a storage policy name when looking for account/
container/object ring location
Notify if there is a mismatch between the ring and the policy
Lookup policy name in swift.conf; 'Legacy' container will use
policy-0's name; 'Unknown' is shown if policy not found in swift.conf
DocImpact
Implements: blueprint storage-policies
Change-Id: I450d40dc6e2d8f759187dff36d658e52737ae2a5
FakeLogger gets better log level handling
Parameterize logger on some daemons which were previously
unparameterized and try and use the interface in tests.
FakeRing uses more real code
The existing FakeRing mock's implementation bit me with a pretty subtle
character encoding issue by bypassing the hash_path code that is normally
part of get_part_nodes. This change tries to exercise more of the real
ring code paths when it makes sense and provide a better Fake for use in
testing.
Add write_fake_ring helper to test.unit for when you need a real ring.
DocImpact
Implements: blueprint storage-policies
Change-Id: Id2e3740b1dd569050f4e083617e7dd6a4249027e
Test compares cluster info to hardcoded expected data and wasn't
sorting the two sets of things being compared leading to some
sporadic unit test failures.
Change-Id: I3ef98260a62c15d06ba8cc196196d4e90abca3f0
Currently, 'Container Count' is missing from the database info output.
This patch prints 'Container Count' as well.
Change-Id: I1ca80ee79e71b086b30fd2d1ab024ea1cfb324f5