56 Commits

Author SHA1 Message Date
Tim Burke
ce9e56a6d1 lint: Consistently use assertIsInstance
This has been available since py32 and was backported to py27; there
is no point in us continuing to carry the old idiom forward.

Change-Id: I21f64b8b2970e2dd5f56836f7f513e7895a5dc88
2024-02-07 15:48:39 -08:00
Romain de Joux
365c0ef005 Encode header in latin-1 with wsgi_to_bytes
Prevent encoding corruption in client's metadata during ssync

Closes-Bug: #2020667
Change-Id: I0ea464bcda16678997865667287aa11ea89cdcde
2023-07-10 15:09:04 +01:00
Tim Burke
4e74e7f558 ssync: Round-trip offsets in meta/ctype Timestamps
Use a double-underscore separator to ensure old code blows up rather
than misinterpreting encoded offsets.
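
For illustration only (a sketch of the idea, not the actual ssync
encoding code):

    # Hypothetical sketch: old receivers parse the whole token with
    # int(token, 16), so the '__' separator makes them fail loudly
    # instead of silently mis-decoding an offset.
    def encode_delta(delta, offset):
        token = '%x' % delta
        if offset:
            token += '__%x' % offset
        return token

    def decode_delta(token):
        if '__' in token:
            delta, offset = token.split('__', 1)
            return int(delta, 16), int(offset, 16)
        return int(token, 16), 0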

Change-Id: Idf9b5118e9b64843e0c4dd7088b498b165f33db4
2023-04-17 13:25:56 +01:00
Alistair Coles
2fe18b24cd ssync: fix decoding of ts_meta when ts_data has offset
The SsyncSender encodes object file timestamps in a compact form and
the SsyncReceiver decodes the timestamps and compares them to its
object file set.

The encoding represents the meta file timestamp as a delta from the
data file timestamp, NOT INCLUDING the data file timestamp offset.

Previously, the decoding was erroneously calculating the meta file
timestamp as the sum of the delta plus the data file timestamp
INCLUDING the offset.

For example, if the SsyncSender has object file timestamps:

  ts_data = t0_1.data
  ts_meta = t1.meta

then the receiver would erroneously perceive that the sender has:

  ts_data = t0_1.data
  ts_meta = t1_1.meta

As described in the referenced bug report, this erroneous decoding
could cause the SsyncReceiver to request that the SsyncSender sync an
object that is already in sync, which results in a 409 Conflict at the
receiver. The 409 causes the ssync session to terminate, and the same
process repeats on the next attempt.
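
A minimal sketch of the fix (illustrative names; timestamps and
offsets modeled as plain numbers):

    # Hypothetical sketch: the meta delta is relative to the bare data
    # timestamp, so the data file offset must be excluded.
    def decode_ts_meta(ts_data, ts_data_offset, meta_delta):
        # Buggy version also added the data file offset:
        #   return ts_data + ts_data_offset + meta_delta
        return ts_data + meta_delta  # the delta excludes the offset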

Closes-Bug: #2007643
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I74a0aac0ac29577026743f87f4b654d85e8fcc80
2023-02-27 07:27:32 -06:00
Alistair Coles
1b3879e0da reconstructor: include partially reverted handoffs in handoffs_remaining
For a reconstructor revert job, if sync'd to sufficient other nodes,
the handoff partition is considered done and handoffs_remaining is not
incremented. With the new max_objects_per_revert option [1], an ssync
job may appear to be complete even though not all objects have yet been
reverted, so handoffs_remaining should be incremented.

[1] Related-Change: If81760c80a4692212e3774e73af5ce37c02e8aff
Change-Id: I59572f75b9b0ba331369eb7358932943b7935ff0
2021-12-03 14:37:59 +00:00
Alistair Coles
8ee631ccee reconstructor: restrict max objects per revert job
Previously the ssync Sender would attempt to revert all objects in a
partition within a single SSYNC request. With this change the
reconstructor daemon option max_objects_per_revert can be used to limit
the number of objects reverted inside a single SSYNC request for revert
type jobs i.e. when reverting handoff partitions.

If more than max_objects_per_revert are available, the remaining objects
will remain in the sender partition and will not be reverted until the
next call to ssync.Sender, which would currently be the next time the
reconstructor visits that handoff partition.

Note that the option only applies to handoff revert jobs, not to sync
jobs.
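
As a rough sketch of the limiting behaviour (illustrative code, not
the actual ssync.Sender implementation):

    # Hypothetical sketch: once the configured limit is reached, stop
    # yielding; the remaining objects stay in the handoff partition
    # until the next reconstructor pass. A limit of 0 means no limit.
    def limit_revert(object_hashes, max_objects_per_revert):
        for count, object_hash in enumerate(object_hashes):
            if 0 < max_objects_per_revert <= count:
                break
            yield object_hash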

Change-Id: If81760c80a4692212e3774e73af5ce37c02e8aff
2021-12-03 12:43:23 +00:00
Alistair Coles
88eb360f5b ssync sender: add context to missing_check error log
Change-Id: I1280b8fc7b06eacb9f94c75db2e93aeea38ac5a2
2021-11-23 17:03:33 +00:00
Clay Gerrard
2a312d1cd5 Cleanup tests' import of debug_logger
Change-Id: I19ca860deaa6dbf388bdcd1f0b0f77f72ff19689
2021-04-27 12:04:41 +01:00
Alistair Coles
1dceafa7d5 ssync: sync non-durable fragments from handoffs
Previously, ssync would neither sync nor clean up non-durable data
fragments on handoffs. When the reconstructor is syncing objects from
a handoff node (a 'revert' reconstructor job) it may be useful, and is
not harmful, to also send non-durable fragments if the receiver has
older or no fragment data.

Several changes are made to enable this. On the sending side:

  - For handoff (revert) jobs, the reconstructor instantiates
    SsyncSender with a new 'include_non_durable' option.
  - If configured with the include_non_durable option, the SsyncSender
    calls the diskfile yield_hashes function with options that allow
    non-durable fragments to be yielded.
  - The diskfile yield_hashes function is enhanced to include a
    'durable' flag in the data structure yielded for each object.
  - The SsyncSender includes the 'durable' flag in the metadata sent
    during the missing_check exchange with the receiver.
  - If the receiver requests the non-durable object, the SsyncSender
    includes a new 'X-Backend-No-Commit' header when sending the PUT
    subrequest for the object.
  - The SsyncSender includes the non-durable object in the collection
    of synced objects returned to the reconstructor so that the
    non-durable fragment is removed from the handoff node.

On the receiving side:

  - The object server includes a new 'X-Backend-Accept-No-Commit'
    header in its response to SSYNC requests. This indicates to the
    sender that the receiver has been upgraded to understand the
    'X-Backend-No-Commit' header.
  - The SsyncReceiver is enhanced to consider non-durable data when
    determining if the sender's data is wanted or not.
  - The object server PUT method is enhanced to check for an
    'X-Backend-No-Commit' header before committing a diskfile.

If a handoff sender has both a durable and newer non-durable fragment
for the same object and frag-index, only the newer non-durable
fragment will be synced and removed on the first reconstructor
pass. The durable fragment will be synced and removed on the next
reconstructor pass.
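
Sketching the sender side of the handshake (the header names are from
this commit; the surrounding logic is illustrative):

    # Hypothetical sketch: only ask the receiver to skip the commit
    # when it advertised support via X-Backend-Accept-No-Commit.
    def put_subrequest_headers(is_durable, receiver_headers):
        headers = {}
        if (not is_durable and
                'X-Backend-Accept-No-Commit' in receiver_headers):
            headers['X-Backend-No-Commit'] = 'True'
        return headers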

Change-Id: I1d47b865e0a621f35d323bbed472a6cfd2a5971b
Closes-Bug: 1778002
2021-01-20 12:00:10 +00:00
Tim Burke
d270596b67 Consistently use io.BytesIO
Change-Id: Ic41b37ac75b5596a8307c4962be86f2a4b0d9731
2019-10-15 15:09:46 +02:00
Tim Burke
d9cafca246 py3: port ssync
Change-Id: I63a502be13f5dcda2a457d38f2fc5f1ca469d562
2019-06-05 13:26:18 -07:00
Clay Gerrard
fb0e7837af Cleanup EC and SSYNC frag index parameters
An object node should reject a PUT with 409 when the timestamp is less
than or equal to the timestamp of an existing version of the object.

However, if the PUT is part of an SSYNC, and the fragment archive has a
different index than the one on disk, we may store it.

We should store it if we're the primary holder for that fragment index.

Back before the related change we used to revert fragments to handoffs
and it caused a lot of problems.  Mainly multiple frag indexes piling up
on one handoff node.  Eventually we settled on handoffs only reverting
to primaries but there was some crufty flailing left over.

When EC frag duplication (multi-region EC) came in we also added a new
complexity because a node's primary index (the index in the part_nodes
list) was no longer universally equal to the EC frag index (the storage
policy backend index).  There were a few places where we assumed
node_index == frag_index, some of which caused bugs which we've fixed.

This change tries to clean all that up.

Related-Change-Id: Ie351d8342fc8e589b143f981e95ce74e70e52784

Change-Id: I3c5935e2d5f1cd140cf52df779596ebd6442686c
2019-02-04 17:02:17 -06:00
Romain LE DISEZ
2d1c438191 SSYNC: stop sharing global available_map/send_map
Change-Id: Iaba8abb81dec792ee92e3715ecc459b57755fcae
2018-10-29 17:39:58 +01:00
Romain LE DISEZ
6b94cf204a SSYNC: Stop sharing a global response
Change-Id: Ia431d20e1132cc139ac067d66d5d1626ec07117f
2018-10-29 17:39:45 +01:00
Romain LE DISEZ
e4ad56abb1 SSYNC: Stop sharing a global connection
Change-Id: Id5988887f01532b27c3888a126764524d2466011
2018-10-29 17:39:31 +01:00
Romain LE DISEZ
46c6fab1cf SSYNC: Remove useless self.failures in sender
Change-Id: Ie5264af0bf2a8d557489c597c3fdc5728e69c6e8
2018-10-29 10:51:38 +01:00
Samuel Merritt
c28004deb0 Multiprocess object replicator
Add a multiprocess mode to the object replicator. Setting the
"replicator_workers" option to a positive value N will result in the
replicator using up to N worker processes to perform replication
tasks.

At most one worker per disk will be spawned, so one can set
replicator_workers=99999999 to always get one worker per disk
regardless of the number of disks in each node. This is the same
behavior that the object reconstructor has.

Worker process logs will have a bit of information prepended so
operators can tell which messages came from which worker. It looks
like this:

  [worker 1/2 pid=16529] 154/154 (100.00%) partitions replicated in 1.02s (150.87/sec, 0s remaining)

The prefix is "[worker M/N pid=P] ", where M is the worker's index, N
is the total number of workers, and P is the process ID. Every message
from the replicator's logger will have the prefix; this includes
messages from down in diskfile, but does not include things printed to
stdout or stderr.
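
The prefix can be reproduced with a small helper (an illustrative
sketch of the format, assuming a 1-based worker index):

    # Hypothetical sketch of building the log prefix described above.
    import os

    def worker_prefix(worker_index, worker_count):
        return '[worker %d/%d pid=%d] ' % (
            worker_index, worker_count, os.getpid())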

Drive-by fix: don't dump recon stats when replicating only certain
policies. When running the object replicator with replicator_workers >
0 and "--policies=X,Y,Z", the replicator would update recon stats
after running. Since it only ran on a subset of objects, it should not
update recon, much like it doesn't update recon when run with
--devices or --partitions.

Change-Id: I6802a9ad9f1f9b9dafb99d8b095af0fdbf174dc5
2018-04-24 04:05:08 +00:00
Samuel Merritt
728b4ba140 Add checksum to object extended attributes
Currently, our integrity checking for objects is pretty weak when it
comes to object metadata. If the extended attributes on a .data or
.meta file get corrupted in such a way that we can still unpickle it,
we don't have anything that detects that.

This could be especially bad with encrypted etags; if the encrypted
etag (X-Object-Sysmeta-Crypto-Etag or whatever it is) gets some bits
flipped, then we'll cheerfully decrypt the cipherjunk into plainjunk,
then send it to the client. Net effect is that the client sees a GET
response with an ETag that doesn't match the MD5 of the object *and*
Swift has no way of detecting and quarantining this object.

Note that, with an unencrypted object, if the ETag metadatum gets
mangled, then the object will be quarantined by the object server or
auditor, whichever notices first.
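
The idea, in rough Python (illustrative only; the real key names and
pickling details live in swift.obj.diskfile):

    import hashlib
    import pickle

    # Hypothetical sketch: store a checksum of the pickled metadata in
    # a sibling xattr so corruption is detectable at read time.
    def checksummed_metadata(metadata):
        blob = pickle.dumps(metadata, protocol=2)
        return blob, hashlib.md5(blob).hexdigest()

    def read_verified(blob, recorded_md5):
        if hashlib.md5(blob).hexdigest() != recorded_md5:
            raise ValueError('metadata checksum mismatch; quarantine')
        return pickle.loads(blob)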

As part of this commit, I also ripped out some mocking of
getxattr/setxattr in tests. It appears to be there to allow unit tests
to run on systems where /tmp doesn't support xattrs. However, since
the mock is keyed off of inode number and inode numbers get re-used,
there's lots of leakage between different test runs. On a real FS,
unlinking a file and then creating a new one of the same name will
also reset the xattrs; this isn't the case with the mock.

The mock was pretty old; Ubuntu 12.04 and up all support xattrs in
/tmp, and recent Red Hat / CentOS releases do too. The xattr mock was
added in 2011; maybe it was to support Ubuntu Lucid Lynx?

Bonus: now you can pause a test with the debugger, inspect its files
in /tmp, and actually see the xattrs along with the data.

Since this patch now uses a real filesystem for testing filesystem
operations, tests are skipped if the underlying filesystem does not
support setting xattrs (eg tmpfs or more than 4k of xattrs on ext4).

References to "/tmp" have been replaced with calls to
tempfile.gettempdir(). This will allow setting the TMPDIR envvar in
test setup and getting an XFS filesystem instead of ext4 or tmpfs.

THIS PATCH SIGNIFICANTLY CHANGES TESTING ENVIRONMENTS

With this patch, every test environment will require TMPDIR to be
using a filesystem that supports at least 4k of extended attributes.
Neither ext4 nor tmpfs supports this. XFS is recommended.

So why all the SkipTests? Why not simply raise an error? We still need
the tests to run on the base image for OpenStack's CI system. Since
we were previously mocking out xattr, there wasn't a problem, but we
also weren't actually testing anything. This patch adds functionality
to validate xattr data, so we need to drop the mock.

`test.unit.skip_if_no_xattrs()` is also imported into `test.functional`
so that functional tests can import it from the functional test
namespace.

The related OpenStack CI infrastructure changes are made in
https://review.openstack.org/#/c/394600/.

Co-Authored-By: John Dickinson <me@not.mn>

Change-Id: I98a37c0d451f4960b7a12f648e4405c6c6716808
2017-11-03 13:30:05 -04:00
Romain LE DISEZ
091157fc7f Fix encoding issue in ssync_sender.send_put()
EC object metadata can currently have a mixture of bytestrings and
unicode.  The ssync_sender.send_put() method raises an
UnicodeDecodeError when it attempts to concatenate the metadata
values, if any bytestring has non-ascii characters.

The root cause of this issue is that the object server uses unicode
for the keys of some object metadata items that are received in the
footer of an EC PUT request, whereas all other object metadata keys
and values are persisted as bytestrings.

This patch fixes the bug by changing diskfile write_metadata()
function to encode all unicode metadata keys and values as utf8
encoded bytes before writing to disk. To cope with existing objects
that have a mixture of unicode and bytestring metadata, the diskfile
read_metadata() function is also changed so that all returned unicode
metadata keys and values are utf8 encoded. This ensures that
ssync_sender.send_put() (and any other caller of diskfile
read_metadata) only reads bytestrings from object metadata.
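
Roughly (a sketch of the normalization; the real change is in
diskfile's write_metadata()/read_metadata()):

    # Hypothetical sketch; six.text_type is unicode on py2, str on py3.
    import six

    def utf8_encoded(metadata):
        def encode(value):
            if isinstance(value, six.text_type):
                return value.encode('utf8')
            return value
        return dict((encode(k), encode(v)) for k, v in metadata.items())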

Closes-Bug: #1678018
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Ic23c55754ee142f6f5388dcda592a3afc9845c39
2017-04-19 18:05:52 +01:00
Alexandre Lécuyer
cff7455a68 Remove unused returned value object_path from yield_hashes()
Some public functions in the diskfile manager expect or return full
file paths. This implies a filesystem diskfile implementation.
To make it easier to plug alternate diskfile implementations, patch
functions to take more generic arguments.

This commit changes DiskFileManager yield_hashes() returned values
from:
        - object_path, object_hash, timestamps
to:
        - object_hash, timestamps

object_path was not used by any caller.

Change-Id: I914fb1ec8ce7c9c26d22e1d07f03bd03f4504176
2017-03-27 17:18:28 +02:00
Kota Tsuyuzaki
b09360d447 Fix stats calculation in object-reconstructor
This patch fixes the object-reconstructor to calculate device_count
as the total number of local devices across all policies. Previously
Swift counted it per policy, whereas reconstruction_device_count
(the number of devices Swift actually needs to reconstruct) was
counted as the sum over all policies.

With this patch, Swift will first gather all local devices for all
policies, and then collect parts for each device as before. This way
we can see the remaining jobs/disks percentages via the stats_line
output.

To enable this change, this patch also touches the object replicator
to get a DiskFileManager via the DiskFileRouter class so that
DiskFileManager instances are policy specific. Currently the same
replication policy DiskFileManager class is always used, but this
change future proofs the replicator for possible other DiskFileManager
implementations.

The change also gives the ObjectReplicator a _df_router variable,
making it consistent with the ObjectReconstructor, and allowing a
common way for ssync.Sender to access DiskFileManager instances via
its daemon's _df_router instance.

Also, remove the use of FakeReplicator from the ssync test suite. It
was not necessary and risked masking divergence between ssync and the
replicator and reconstructor daemon implementations.

Co-Author: Alistair Coles <alistair.coles@hpe.com>

Closes-Bug: #1488608
Change-Id: Ic7a4c932b59158d21a5fb4de9ed3ed57f249d068
2016-12-12 21:26:54 -08:00
Alistair Coles
3218f8b064 Prevent ssync writing bad fragment data to diskfile
Previously, if a reconstructor sync type job failed to provide
sufficient bytes from a reconstructed fragment body iterator to match
the content-length that the ssync sender had already sent to the ssync
receiver, the sender would still proceed to send the next
subrequest. The ssync receiver might then write the start of the next
subrequest to the partially complete diskfile for the previous
subrequest (including writing subrequest headers to that diskfile)
until it has received content-length bytes.

Since a reconstructor ssync job does not send an ETag header (it
cannot, because it does not know the ETag of a reconstructed fragment
until it has been sent), the receiving object server does not
detect the "bad" data written to the fragment diskfile, and worse,
will label it with an ETag that matches the md5 sum of the bad
data. The bad fragment file will therefore appear good to the auditor.

There is no easy way for the ssync sender to communicate a lack of
source data to the receiver other than by disconnecting the
session. So this patch adds a check in the ssync sender that the sent
byte count is equal to the sent Content-Length header value for each
subrequest, and disconnect if a mismatch is detected.

The disconnect prevents the receiver finalizing the bad diskfile, but
also prevents subsequent fragments in the ssync job being sync'd until
the next cycle.
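
The guard amounts to something like this (an illustrative sketch, not
the actual send_put code):

    # Hypothetical sketch: abort the session on a shortfall so the
    # receiver cannot finalize a partial diskfile.
    def check_sent_bytes(sent_bytes, content_length):
        if sent_bytes != content_length:
            raise ValueError('sent %d bytes but Content-Length was %d'
                             % (sent_bytes, content_length))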

Closes-Bug: #1631144
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>

Change-Id: I54068906efdb9cd58fcdc6eae7c2163ea92afb9d
2016-10-13 17:15:10 +01:00
Alistair Coles
e91de49d68 Update container on fast-POST
This patch makes a number of changes to enable content-type
metadata to be updated when using the fast-POST mode of
operation, as proposed in the associated spec [1].

* the object server and diskfile are modified to allow
  content-type to be updated by a POST and the updated value
  to be stored in .meta files.

* the object server accepts PUTs and DELETEs with older
  timestamps than existing .meta files. This is to be
  consistent with replication that will leave a later .meta
  file in place when replicating a .data file.

* the diskfile interface is modified to provide accessor
  methods for the content-type and its timestamp.

* the naming of .meta files is modified to encode two
  timestamps when the .meta file contains a content-type value
  that was set prior to the latest metadata update; this
  enables consistency to be achieved when rsync is used for
  replication (see the sketch after this list).

* ssync is modified to sync meta files when content-type
  differs between local and remote copies of objects.

* the object server issues container updates when handling
  POST requests, notifying the container server of the current
  immutable metadata (etag, size, hash, swift_bytes),
  content-type with their respective timestamps, and the
  mutable metadata timestamp.

* the container server maintains the most recently reported
  values for immutable metadata, content-type and mutable
  metadata, each with their respective timestamps, in a single
  db row.

* new probe tests verify that replication achieves eventual
  consistency of containers and objects after discrete updates
  to content-type and mutable metadata, and that container-sync
  sync's objects after fast-post updates.

[1] spec change-id: I60688efc3df692d3a39557114dca8c5490f7837e
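
To picture the two-timestamp naming, here is a hypothetical encoding
for illustration only; the actual scheme is implemented in
swift.obj.diskfile:

    # Hypothetical sketch: name a .meta file with the metadata
    # timestamp plus the older content-type timestamp encoded as a
    # hex delta, so both survive an rsync-based replication.
    def meta_filename(ts_meta, ts_ctype=None):
        if ts_ctype is None or ts_ctype >= ts_meta:
            return '%.5f.meta' % ts_meta
        delta = int(round((ts_meta - ts_ctype) * 100000))
        return '%.5f-%x.meta' % (ts_meta, delta)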

Change-Id: Ia597cd460bb5fd40aa92e886e3e18a7542603d01
2016-03-03 14:25:10 +00:00
Alistair Coles
60b2e02905 Make ECDiskFile report all fragments found on disk
Refactor the disk file get_ondisk_files logic to enable
ECDiskFile to gather *all* fragments found on disk (not just those
with a matching .durable file) and make the fragments available
via the DiskFile interface as a dict mapping:

    Timestamp --> list of fragment indexes

Also, if a durable fragment has been found then the timestamp
of the durable file is exposed via the diskfile interface.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I55e20a999685b94023d47b231d51007045ac920e
2015-12-15 15:25:11 +00:00
Alistair Coles
6858510b59 Re-organise ssync tests
We have some tests that exercise both the sender and receiver,
but are spread across test_ssync_sender.py and test_ssync_receiver.py.
This creates a new module test_ssync.py and moves the end-to-end tests
there.

Change-Id: Iea3e9932734924453f7241432afda90abbc75c06
2015-11-05 14:50:28 +00:00
Victor Stinner
8f85427939 py3: Replace gen.next() with next(gen)
The next() method of Python 2 generators was renamed to __next__().
Call the builtin next() function instead which works on Python 2 and
Python 3.

The patch was generated by the next operation of the sixer tool.

Change-Id: Id12bc16cba7d9b8a283af0d392188a185abe439d
2015-10-08 15:40:06 +02:00
Victor Stinner
c0af385173 py3: Replace urllib imports with six.moves.urllib
The urllib, urllib2 and urlparse modules of Python 2 were reorganized
into a new urllib namespace on Python 3. Replace urllib, urllib2 and
urlparse imports with six.moves.urllib to make the modified code
compatible with Python 2 and Python 3.

The initial patch was generated by the urllib operation of the sixer
tool on: bin/* swift/ test/.

Change-Id: I61a8c7fb7972eabc7da8dad3b3d34bceee5c5d93
2015-10-08 15:24:13 +02:00
Alistair Coles
29c10db0cb Add POST capability to ssync for .meta files
ssync currently does the wrong thing when replicating object dirs
containing both a .data and a .meta file. The ssync sender uses a
single PUT to send both object content and metadata to the receiver,
using the metadata (.meta file) timestamp. This results in the object
content timestamp being advanced to the metadata timestamp,
potentially overwriting newer object data on the receiver and causing
an inconsistency with the container server record for the object.

For example, replicating an object dir with {t0.data(etag=x), t2.meta}
to a receiver with t1.data(etag=y) will result in the creation of
t2.data(etag=x) on the receiver. However, the container server will
continue to list the object as t1(etag=y).

This patch modifies ssync to replicate the content of .data and .meta
separately using a PUT request for the data (no change) and a POST
request for the metadata. In effect, ssync replication replicates the
client operations that generated the .data and .meta files so that
the result of replication is the same as if the original client requests
had persisted on all object servers.

Apart from maintaining correct timestamps across sync'd nodes, this has
the added benefit of not needing to PUT objects when only the metadata
has changed and a POST will suffice.

Taking the same example, ssync sender will no longer PUT t0.data but will
POST t2.meta resulting in the receiver having t1.data and t2.meta.

The changes are backwards compatible: an upgraded sender will only sync
data files to a legacy receiver and will not sync meta files (fixing the
erroneous behavior described above); a legacy sender will operate as
before when sync'ing to an upgraded receiver.
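
The 'wanted' qualifier can be pictured roughly like this (an
illustrative sketch; the exact wire format is defined by
ssync_receiver):

    # Hypothetical sketch: qualify each wanted hash with which parts
    # (data and/or meta) the receiver needs during the updates phase.
    def wanted_line(object_hash, want_data, want_meta):
        parts = ('d' if want_data else '') + ('m' if want_meta else '')
        return '%s %s' % (object_hash, parts) if parts else None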

Changes:
- diskfile API provides methods to get the data file timestamp
  as distinct from the diskfile timestamp.

- diskfile yield_hashes return tuple now passes a dict mapping data and
  meta (if any) timestamps to their respective values in the timestamp
  field.

- ssync_sender will encode data and meta timestamps in the
  (hash_path, timestamp) tuple sent to the receiver during
  missing_checks.

- ssync_receiver compares sender's data and meta timestamps to any
  local diskfile and may specify that only data or meta parts are sent
  during updates phase by appending a qualifier to the hash returned
  in its 'wanted' list.

- ssync_sender now sends POST subrequests when a meta file
  exists and its content needs to be replicated.

- ssync_sender may send *only* a POST if the receiver indicates that
  is the only part required to be sync'd.

- object server will allow PUT and DELETE with earlier timestamp than
  a POST

- Fixed TODO related to replicated objects with fast-POST and ssync

Related spec change-id: I60688efc3df692d3a39557114dca8c5490f7837e

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Closes-Bug: 1501528
Change-Id: I97552d194e5cc342b0a3f4b9800de8aa6b9cb85b
2015-10-02 11:24:19 +00:00
paul luse
a3facce53c Fix invalid frag_index header in ssync_sender when reverting EC tombstones
Back in d124ce [1] we failed to recognize the situation where a revert
job would have an explicit frag_index key with the literal value None
which would take precedence over the dict.get's default value of ''.
Later in ssync_receiver we'd bump into the ValueError converting 'None'
to an int (again).

In ssync_sender we now handle literal None's correctly and should
hopefully no longer put this invalid headers on the wire - but for belts
and braces we'll also update ssync_receiver to raise a 400 series error
and ssync_sender to better log the error messages.

1. https://review.openstack.org/#/c/195457/

Co-Author: Clay Gerrard <clay.gerrard@gmail.com>
Co-Author: Alistair Coles <alistair.coles@hp.com>
Change-Id: Ic71ba7cc82487773214030207bb193f425319449
Closes-Bug: 1489546
2015-09-08 14:58:11 -07:00
Clay Gerrard
05de1305a9 Make ssync_sender send valid chunked requests
The connect method of ssync_sender tells the remote connection that it's
going to send a valid HTTP chunked request, but if the remote end needs
to respond with an error of any kind sender throws HTTP right out the
window, picks up his ball, and closes the socket down hard - much to the
surprise of the eventlet.wsgi server who up to this point had been
playing along quite nicely with this 'SSYNC' nonsense assuming that
everyone here is consenting mature adults.

If you're going to make a "Transfer-Encoding: chunked" request have the
good decency to finish the job with a proper '0\r\n\r\n'. [1]
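
In other words (a sketch; the real disconnect logic lives in
ssync_sender):

    # Hypothetical sketch: always terminate the chunked body before
    # closing the socket, whatever went wrong.
    def disconnect(sock):
        try:
            sock.sendall(b'0\r\n\r\n')  # chunked transfer terminator
        finally:
            sock.close()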

N.B. It might be possible to handle an error status during the
initialize_request phase with some sort of 100-continue support, but
honestly it's not entirely clear to me when the server isn't going to
close the connection if the client is still expected to send the body
[2] - further if the error comes later during missing_check or updates
we'll for sure want to send the chunk transfer termination line before
we close down the socket and this way we cover both.

1. Really, eventlet.wsgi shouldn't be so blasted brittle about this [3]
2. https://lists.w3.org/Archives/Public/ietf-http-wg/2005AprJun/0007.html
3. c3ce3eef0b

Closes-Bug: #1489587
Change-Id: Ic17c6c3075553f8cf6ef6213e62a00282f0d01cf
2015-08-28 11:38:05 -07:00
janonymous
9456af35a2 pep8 fix: assertEquals -> assertEqual
assertEquals is deprecated in py3; changes in
dirs:
*test/unit/obj/*
*test/unit/test_locale/*

Change-Id: I3dd0c1107165ac529f1cd967363e5cf408a1d02b
2015-08-07 19:28:35 +05:30
Alistair Coles
6f89f71f9b Filter Etag key from ssync replication-headers
ssync rx sends a header X-Backend-Replication-Headers whose value is a
list of headers that the source object has. This list extends the list
of allowed headers on the target object server, so that the target
object metadata is faithfully reconstructed to match the source.

Unfortunately the combination of lower() and title() operations on
header keys results in the source 'ETag' value being added to the target
metadata under the key 'Etag' in addition to the 'ETag' key that the
receiving server adds (note different capitilization), both having
the same value.

The spurious 'Etag' metadata is potentially confusing for humans
inspecting the object metadata and complicates tests that wish to
assert the equality of two object metadata dicts. See for example the
test in test_ssync_sender.py that this patch cleans up.

Furthermore, the possibility of having both Etag and ETag keys has
required a workaround in the EC reconstructor [1].

[1] reconstructor fix change id: Ie59ad93a67a7f439c9a84cd9cff31540f97f334a

Change-Id: I0c89cf7924a4471bb6d268b5ef3884e2d2cb4286
2015-07-24 21:51:10 -07:00
Jenkins
260e976e50 Merge "Get StringIO and cStringIO from six.moves" 2015-07-24 06:52:36 +00:00
janonymous
cd7b2db550 unit tests: Replace "self.assert_" by "self.assertTrue"
The assert_() method is deprecated and can be safely replaced by assertTrue().
This patch makes sure that running the tests does not create undesired
warnings.

Change-Id: I0602ba39ef93263386644ee68088d5f65fcb4a71
2015-07-21 19:23:00 +05:30
Victor Stinner
6e70f3fa32 Get StringIO and cStringIO from six.moves
* replace "from cStringIO import StringIO"
  with "from six.moves import cStringIO as StringIO"
* replace "from StringIO import StringIO"
  with "from six import StringIO"
* replace "import cStringIO" and "cStringIO.StringIO()"
  with "from six import moves" and "moves.cStringIO()"
* replace "import StringIO" and "StringIO.StringIO()"
  with "import six" and "six.StringIO()"

This patch was generated by the stringio operation of the sixer tool:
https://pypi.python.org/pypi/sixer

Change-Id: Iacba77fec3045f96773d1090c0bd48613729a561
2015-07-15 16:56:33 +02:00
Jenkins
b2e79357bb Merge "Replace dict.iteritems() with dict.items()" 2015-07-09 18:36:05 +00:00
Clay Gerrard
c95a0efe79 Make ssync_sender a better HTTP client
When a server responds with an error - if that error includes a body - the
client should read the body.  This cleans up some ugly eventlet/wsgi.server log
output related to chunked transfer disconnect (invalid literal for int() with
base 16).

Change-Id: Ibd06ddee9f216fce07fa33c3a7d8306b59eb6d77
Closes-Bug: #1466138
2015-06-29 11:38:01 +10:00
Clay Gerrard
d124ce5792 Fix ValueError in ssync_receiver
httplib's putheader method will cast whatever you give it to a string.
Where we allow the dict.get default of None to be passed to
putheader unmodified, ssync_receiver is surprised that the non-empty
string ('None') isn't able to be converted to an integer.

We can avoid surprising the ssync_receiver in this way by sending the
empty string as a better default.
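
The fix is essentially (an illustrative sketch; the header name is
assumed from the related ssync changes):

    # Hypothetical sketch: httplib's putheader() str()s its argument,
    # so a None default would arrive on the wire as 'None'; default to
    # the empty string instead.
    def add_frag_index_header(connection, job):
        connection.putheader('X-Backend-Ssync-Frag-Index',
                             job.get('frag_index', ''))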

Change-Id: Ie9df9927ff4d3dd3f334647f883b2937d0d81030
2015-06-26 12:49:26 -07:00
Victor Stinner
e70b66586e Replace dict.iteritems() with dict.items()
The iteritems() of Python 2 dictionaries has been renamed to items() on
Python 3. According to a discussion on the openstack-dev mailing list,
the overhead of creating a temporary list using dict.items() on Python 2
is very low because most dictionaries are small:

http://lists.openstack.org/pipermail/openstack-dev/2015-June/066391.html

Patch generated by the following command:

    sed -i 's,iteritems,items,g' \
      $(find swift -name "*.py") \
      $(find test -name "*.py")

Change-Id: I6070bb6c684be76e8e77222a7d280ec6edd43496
2015-06-24 09:39:55 +02:00
Jenkins
66db3bc2ce Merge "EC Ssync: Update parms to include node and frag indices" 2015-06-23 05:32:42 +00:00
paul luse
ac8a769585 EC Ssync: Update parms to include node and frag indices
Previously we sent the ssync backend frag index based on the node
index.  We need to be more specific for ssync to handle both sync
and revert cases so now we send the frag index based on the job
contents (as determined by the EC reconstructor) and the node index
as a new header based on, well, the node index.

The rcvr can now validate the incoming pair to reject (400) when
a primary node is being asked to accept fragments that don't
belong to it.  Additionally, by having the frag index the
rcvr can reject (409) an attempt to accept a fragment when it's
a handoff and already has one that needs to be reverted.
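
The receiver-side checks amount to something like this (names and
structure are hypothetical):

    # Hypothetical sketch of validating the incoming frag index /
    # node index pair described above.
    def validate_frag_index(is_handoff, frag_index, local_frag_indexes):
        if not is_handoff and frag_index not in local_frag_indexes:
            return 400  # primary asked for a fragment it doesn't own
        if is_handoff and frag_index in local_frag_indexes:
            return 409  # handoff already holds one pending revert
        return 200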

Fixes-bug: #1452619
Change-Id: I8287b274bbbd00903c1975fe49375590af697be4
2015-06-19 16:30:11 -07:00
janonymous
09e7477a39 Replace it.next() with next(it) for py3 compat
The Python 2 next() method of iterators was renamed to __next__() on
Python 3. Use the builtin next() function instead which works on Python
2 and Python 3.

Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d
2015-06-15 22:10:45 +05:30
Alistair Coles
191f2a00bd Remove _ensure_flush() from SSYNC receiver
The Receiver._ensure_flush() method in ssync_receiver.py has
the following comment:

    Sends a blank line sufficient to flush buffers.

    This is to ensure Eventlet versions that don't support
    eventlet.minimum_write_chunk_size will send any previous data
    buffered.

    If https://bitbucket.org/eventlet/eventlet/pull-request/37
    ever gets released in an Eventlet version, we should make
    this yield only for versions older than that.

The reference pull request was included with eventlet 0.14 [1] and
swift now requires >=0.16.1 so it is safe to remove _ensure_flush()
and save > 8k bytes per SSYNC response.

[1] 4bd654205a

Change-Id: I367e9a6e92b7ea75fe7e5795cded212657de57ed
2015-05-27 15:01:43 +01:00
Alistair Coles
3aa06f185a Make SSYNC receiver return a response when initial checks fail
The ssync Receiver performs some checks on request parameters
in initialize_request() before starting the exchange of missing
hashes and updates e.g. the destination device must be available;
the policy must be valid. Currently if any of these checks fails
then the receiver just closes the connection, so the Sender gets
no useful response code and noise is generated in logs by httplib
and wsgi Exceptions.

This change moves the request parameter checks to the Receiver
constructor so that the HTTPExceptions raised are actually sent
as responses. (The 'connection close' exception handling still
applies once the 'missing_check' and 'updates' handshakes are in
progress.)

Moving initialize_request() revealed the following lurking bug:
 * initialize_request() sets
       req.environ['eventlet.minimum_write_chunk_size'] = 0
 * this was previously ineffective because the Response environ
   had already been copied from Request environ before this value
   was set, so the Response never used the value :/
 * Now that it is effective (a good thing) it causes the empty string
   yielded by the receiver when there are no missing hashes in
   missing_checks() to be sent to the sender immediately. This makes
   the Sender.readline() think there has been an early disconnect
   and raise an Exception (a bad thing), as revealed by
   test/unit/obj/test_ssync_sender.py:TestSsync.test_nothing_to_sync

The fix for this is to simply make the receiver skip sending the empty
string if there are no missing object_hashes.

Change-Id: I036a6919fead6e970505dccbb0da7bfbdf8cecc3
2015-05-27 15:01:11 +01:00
Alistair Coles
98b725fec6 Cleanup and extend end to end ssync tests
Extends the existing end to end ssync tests with
a test using replication policy.

Also some cleanup and improvements to the test framework e.g. rather
than faking the connection between sender and receiver, use a real
connection and wrap it to capture traffic for verification.

Change-Id: Id71d2eb3fb8fa15c016ef151aacf95f97196a902
2015-05-13 11:05:13 +01:00
paul luse
647b66a2ce Erasure Code Reconstructor
This patch adds the erasure code reconstructor. It follows the
design of the replicator but:
  - There is no notion of update() or update_deleted().
  - There is a single job processor
  - Jobs are processed partition by partition.
  - At the end of processing a rebalanced or handoff partition, the
    reconstructor will remove successfully reverted objects if any.

And various ssync changes, such as the addition of a reconstruct_fa()
function, called from ssync_sender, which performs the actual
reconstruction while sending the object to the receiver.

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
blueprint ec-reconstructor
Change-Id: I7d15620dc66ee646b223bb9fff700796cd6bef51
2015-04-14 00:52:17 -07:00
Alistair Coles
fa89064933 Per-policy DiskFile classes
Adds specific disk file classes for EC policy types.

The new ECDiskFile and ECDiskFileWriter classes are used by the
ECDiskFileManager.

ECDiskFileManager is registered with the DiskFileRouter for use with
EC_POLICY type policies.

Refactors diskfile tests into BaseDiskFileMixin and BaseDiskFileManagerMixin
classes which are then extended in subclasses for the legacy
replication-type DiskFile* and ECDiskFile* classes.

Refactor to prefer use of a policy instance reference over a policy_index
int to refer to a policy.

Add additional verification to DiskFileManager.get_dev_path to validate the
device root with common.constraints.check_dir, even when mount_check is
disabled, for use on a virtual swift-all-in-one.

Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I22f915160dc67a9e18f4738c1ddf068344e8ad5d
2015-04-14 00:52:16 -07:00
Jenkins
28c99763e9 Merge "Fix ssync send_delete" 2015-02-13 00:18:38 +00:00
Alistair Coles
82e5090848 Fix ssync send_delete
The ssync_sender send_delete method treats its
timestamp argument as a string when in fact it is
passed a Timestamp object. As a result the method
always raises an exception and deletes are never
replicated.

This patch fixes the bug and adds unit and probe tests
to verify expected behavior.

Closes-Bug: 1421425

Change-Id: I664fb8d5dfea7362313037a67927ea90021c3f62
2015-02-12 21:44:36 +00:00
Kota Tsuyuzaki
20ca279d74 Efficient Replication for Distributed Regions
This change provides an efficient way of replicating
between regions of a globally distributed cluster.

This approach makes the object-replicator push replicas
to one primary node in a remote region and then skip
pushing them to the next primary nodes in that region,
expecting asynchronous replication within the region.

This implementation includes a couple of changes to
ssync_sender to allow the object-replicator to delete local
handoff objects correctly. One is to return a list of existing
objects in the remote region; the list includes local paths of the
objects which exist on both the local device and the remote device.
The other is support for an existence check on specified objects,
which requires the object list built by the first change. When
the object list is given, ssync_sender does only the missing_check
based on the list. These changes are needed because current
Swift cannot handle existence checks at the object level.

Note that this feature will work partially (i.e. only when
primary-to-primary) with rsync.

Implements: blueprint efficient-replication
Change-Id: I5d990444d7977f4127bb37f9256212c893438df1
2015-02-10 12:52:15 -08:00