swift

Author	SHA1	Message	Date
Alistair Coles	60b2e02905	Make ECDiskFile report all fragments found on disk Refactor the disk file get_ondisk_files logic to enable ECDiskfile to gather all fragments found on disk (not just those with a matching .durable file) and make the fragments available via the DiskFile interface as a dict mapping: Timestamp --> list of fragment indexes Also, if a durable fragment has been found then the timestamp of the durable file is exposed via the diskfile interface. Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Change-Id: I55e20a999685b94023d47b231d51007045ac920e	2015-12-15 15:25:11 +00:00
Alistair Coles	6858510b59	Re-organise ssync tests We have some tests that exercise both the sender and receiver, but are spread across test_ssync_sender.py and test_ssync_receiver.py. This creates a new module test_ssync.py and moves the end-to-end tests into there. Change-Id: Iea3e9932734924453f7241432afda90abbc75c06	2015-11-05 14:50:28 +00:00
Victor Stinner	8f85427939	py3: Replace gen.next() with next(gen) The next() method of Python 2 generators was renamed to __next__(). Call the builtin next() function instead which works on Python 2 and Python 3. The patch was generated by the next operation of the sixer tool. Change-Id: Id12bc16cba7d9b8a283af0d392188a185abe439d	2015-10-08 15:40:06 +02:00
Victor Stinner	c0af385173	py3: Replace urllib imports with six.moves.urllib The urllib, urllib2 and urlparse modules of Python 2 were reorganized into a new urllib namespace on Python 3. Replace urllib, urllib2 and urlparse imports with six.moves.urllib to make the modified code compatible with Python 2 and Python 3. The initial patch was generated by the urllib operation of the sixer tool on: bin/* swift/ test/. Change-Id: I61a8c7fb7972eabc7da8dad3b3d34bceee5c5d93	2015-10-08 15:24:13 +02:00
Alistair Coles	29c10db0cb	Add POST capability to ssync for .meta files ssync currently does the wrong thing when replicating object dirs containing both a .data and a .meta file. The ssync sender uses a single PUT to send both object content and metadata to the receiver, using the metadata (.meta file) timestamp. This results in the object content timestamp being advanced to the metadata timestamp, potentially overwriting newer object data on the receiver and causing an inconsistency with the container server record for the object. For example, replicating an object dir with {t0.data(etag=x), t2.meta} to a receiver with t1.data(etag=y) will result in the creation of t2.data(etag=x) on the receiver. However, the container server will continue to list the object as t1(etag=y). This patch modifies ssync to replicate the content of .data and .meta separately using a PUT request for the data (no change) and a POST request for the metadata. In effect, ssync replication replicates the client operations that generated the .data and .meta files so that the result of replication is the same as if the original client requests had persisted on all object servers. Apart from maintaining correct timestamps across sync'd nodes, this has the added benefit of not needing to PUT objects when only the metadata has changed and a POST will suffice. Taking the same example, ssync sender will no longer PUT t0.data but will POST t2.meta resulting in the receiver having t1.data and t2.meta. The changes are backwards compatible: an upgraded sender will only sync data files to a legacy receiver and will not sync meta files (fixing the erroneous behavior described above); a legacy sender will operate as before when sync'ing to an upgraded receiver. Changes: - diskfile API provides methods to get the data file timestamp as distinct from the diskfile timestamp. 
- diskfile yield_hashes return tuple now passes a dict mapping data and meta (if any) timestamps to their respective values in the timestamp field. - ssync_sender will encode data and meta timestamps in the (hash_path, timestamp) tuple sent to the receiver during missing_checks. - ssync_receiver compares sender's data and meta timestamps to any local diskfile and may specify that only data or meta parts are sent during updates phase by appending a qualifier to the hash returned in its 'wanted' list. - ssync_sender now sends POST subrequests when a meta file exists and its content needs to be replicated. - ssync_sender may send only a POST if the receiver indicates that is the only part required to be sync'd. - object server will allow PUT and DELETE with earlier timestamp than a POST - Fixed TODO related to replicated objects with fast-POST and ssync Related spec change-id: I60688efc3df692d3a39557114dca8c5490f7837e Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Closes-Bug: 1501528 Change-Id: I97552d194e5cc342b0a3f4b9800de8aa6b9cb85b	2015-10-02 11:24:19 +00:00
paul luse	a3facce53c	Fix invalid frag_index header in ssync_sender when reverting EC tombstones Back in d124ce [1] we failed to recognize the situation where a revert job would have an explicit frag_index key with the literal value None which would take precedence over the dict.get's default value of ''. Later in ssync_receiver we'd bump into the ValueError converting 'None' to an int (again). In ssync_sender we now handle literal None's correctly and should hopefully no longer put these invalid headers on the wire - but for belts and braces we'll also update ssync_receiver to raise a 400 series error and ssync_sender to better log the error messages. 1. https://review.openstack.org/#/c/195457/ Co-Author: Clay Gerrard <clay.gerrard@gmail.com> Co-Author: Alistair Coles <alistair.coles@hp.com> Change-Id: Ic71ba7cc82487773214030207bb193f425319449 Closes-Bug: 1489546	2015-09-08 14:58:11 -07:00
Clay Gerrard	05de1305a9	Make ssync_sender send valid chunked requests The connect method of ssync_sender tells the remote connection that it's going to send a valid HTTP chunked request, but if the remote end needs to respond with an error of any kind sender throws HTTP right out the window, picks up his ball, and closes the socket down hard - much to the surprise of the eventlet.wsgi server who up to this point had been playing along quite nicely with this 'SSYNC' nonsense assuming that everyone here is consenting mature adults. If you're going to make a "Transfer-Encoding: chunked" request have the good decency to finish the job with a proper '0\r\n\r\n'. [1] N.B. It might be possible to handle an error status during the initialize_request phase with some sort of 100-continue support, but honestly it's not entirely clear to me when the server isn't going to close the connection if the client is still expected to send the body [2] - further if the error comes later during missing_check or updates we'll for sure want to send the chunk transfer termination line before we close down the socket and this way we cover both. 1. Really, eventlet.wsgi shouldn't be so blasted brittle about this [3] 2. https://lists.w3.org/Archives/Public/ietf-http-wg/2005AprJun/0007.html 3. `c3ce3eef0b` Closes-Bug #1489587 Change-Id: Ic17c6c3075553f8cf6ef6213e62a00282f0d01cf	2015-08-28 11:38:05 -07:00
janonymous	9456af35a2	pep8 fix: assertEquals -> assertEqual assertEquals is deprecated in py3,changes in dir: test/unit/obj/ test/unit/test_locale/ Change-Id: I3dd0c1107165ac529f1cd967363e5cf408a1d02b	2015-08-07 19:28:35 +05:30
Alistair Coles	6f89f71f9b	Filter Etag key from ssync replication-headers ssync rx sends a header X-Backend-Replication-Headers whose value is a list of headers that the source object has. This list extends the list of allowed headers on the target object server, so that the target object metadata is faithfully reconstructed to match the source. Unfortunately the combination of lower() and title() operations on header keys results in the source 'ETag' value being added to the target metadata under the key 'Etag' in addition to the 'ETag' key that the receiving server adds (note different capitalization), both having the same value. The spurious 'Etag' metadata is potentially confusing for humans inspecting the object metadata and complicates tests that wish to assert the equality of two object metadata dicts. See for example the test in test_ssync_sender.py that this patch cleans up. Furthermore, the possibility of having both Etag and ETag keys has required a workaround in the EC reconstructor [1]. [1] reconstructor fix change id: Ie59ad93a67a7f439c9a84cd9cff31540f97f334a Change-Id: I0c89cf7924a4471bb6d268b5ef3884e2d2cb4286	2015-07-24 21:51:10 -07:00
Jenkins	260e976e50	Merge "Get StringIO and cStringIO from six.moves"	2015-07-24 06:52:36 +00:00
janonymous	cd7b2db550	unit tests: Replace "self.assert_" by "self.assertTrue" The assert_() method is deprecated and can be safely replaced by assertTrue(). This patch makes sure that running the tests does not create undesired warnings. Change-Id: I0602ba39ef93263386644ee68088d5f65fcb4a71	2015-07-21 19:23:00 +05:30
Victor Stinner	6e70f3fa32	Get StringIO and cStringIO from six.moves * replace "from cStringIO import StringIO" with "from six.moves import cStringIO as StringIO" * replace "from StringIO import StringIO" with "from six import StringIO" * replace "import cStringIO" and "cStringIO.StringIO()" with "from six import moves" and "moves.cStringIO()" * replace "import StringIO" and "StringIO.StringIO()" with "import six" and "six.StringIO()" This patch was generated by the stringio operation of the sixer tool: https://pypi.python.org/pypi/sixer Change-Id: Iacba77fec3045f96773d1090c0bd48613729a561	2015-07-15 16:56:33 +02:00
Jenkins	b2e79357bb	Merge "Replace dict.iteritems() with dict.items()"	2015-07-09 18:36:05 +00:00
Clay Gerrard	c95a0efe79	Make ssync_sender a better HTTP client When a server responds with an error - if that error includes a body - the client should read the body. This cleans up some ugly eventlet/wsgi.server log output related to chunked transfer disconnect (invalid literal for int() with base 16). Change-Id: Ibd06ddee9f216fce07fa33c3a7d8306b59eb6d77 Closes-Bug: #1466138	2015-06-29 11:38:01 +10:00
Clay Gerrard	d124ce5792	Fix ValueError in ssync_receiver httplib's putheader method will cast whatever you give it to a string. Where we allow the dict.get default of None to be passed to putheader unmodified, ssync_receiver is surprised that the non-empty string isn't able to be converted to an integer. We can avoid surprising the ssync_receiver in this way by sending the empty string as a better default. Change-Id: Ie9df9927ff4d3dd3f334647f883b2937d0d81030	2015-06-26 12:49:26 -07:00
Victor Stinner	e70b66586e	Replace dict.iteritems() with dict.items() The iteritems() method of Python 2 dictionaries has been renamed to items() on Python 3. According to a discussion on the openstack-dev mailing list, the overhead of creating a temporary list using dict.items() on Python 2 is very low because most dictionaries are small: http://lists.openstack.org/pipermail/openstack-dev/2015-June/066391.html Patch generated by the following command: sed -i 's,iteritems,items,g' \ $(find swift -name "*.py") \ $(find test -name "*.py") Change-Id: I6070bb6c684be76e8e77222a7d280ec6edd43496	2015-06-24 09:39:55 +02:00
Jenkins	66db3bc2ce	Merge "EC Ssync: Update parms to include node and frag indices"	2015-06-23 05:32:42 +00:00
paul luse	ac8a769585	EC Ssync: Update parms to include node and frag indices Previously we sent the ssync backend frag index based on the node index. We need to be more specific for ssync to handle both sync and revert cases so now we send the frag index based on the job contents (as determined by the ec recon) and the node index as a new header based on, well, the node index. The rcvr can now validate the incoming pair to reject (400) when a primary node is being asked to accept fragments that don't belong to it. Additionally, by having the frag index the rcvr can reject (409) an attempt to accept a fragment when it's a handoff and already has one that needs to be reverted. Fixes-bug: #1452619 Change-Id: I8287b274bbbd00903c1975fe49375590af697be4	2015-06-19 16:30:11 -07:00
janonymous	09e7477a39	Replace it.next() with next(it) for py3 compat The Python 2 next() method of iterators was renamed to __next__() on Python 3. Use the builtin next() function instead which works on Python 2 and Python 3. Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d	2015-06-15 22:10:45 +05:30
Alistair Coles	191f2a00bd	Remove _ensure_flush() from SSYNC receiver The Receiver._ensure_flush() method in ssync_receiver.py has the following comment: Sends a blank line sufficient to flush buffers. This is to ensure Eventlet versions that don't support eventlet.minimum_write_chunk_size will send any previous data buffered. If https://bitbucket.org/eventlet/eventlet/pull-request/37 ever gets released in an Eventlet version, we should make this yield only for versions older than that. The reference pull request was included with eventlet 0.14 [1] and swift now requires >=0.16.1 so it is safe to remove _ensure_flush() and save > 8k bytes per SSYNC response. [1] `4bd654205a` Change-Id: I367e9a6e92b7ea75fe7e5795cded212657de57ed	2015-05-27 15:01:43 +01:00
Alistair Coles	3aa06f185a	Make SSYNC receiver return a response when initial checks fail The ssync Receiver performs some checks on request parameters in initialize_request() before starting the exchange of missing hashes and updates e.g. the destination device must be available; the policy must be valid. Currently if any of these checks fails then the receiver just closes the connection, so the Sender gets no useful response code and noise is generated in logs by httplib and wsgi Exceptions. This change moves the request parameter checks to the Receiver constructor so that the HTTPExceptions raised are actually sent as responses. (The 'connection close' exception handling still applies once the 'missing_check' and 'updates' handshakes are in progress.) Moving initialize_request() revealed the following lurking bug: * initialize_request() sets req.environ['eventlet.minimum_write_chunk_size'] = 0 * this was previously ineffective because the Response environ had already been copied from Request environ before this value was set, so the Response never used the value :/ * Now that it is effective (a good thing) it causes the empty string yielded by the receiver when there are no missing hashes in missing_checks() to be sent to the sender immediately. This makes the Sender.readline() think there has been an early disconnect and raise an Exception (a bad thing), as revealed by test/unit/obj/test_ssync_sender.py:TestSsync.test_nothing_to_sync The fix for this is to simply make the receiver skip sending the empty string if there are no missing object_hashes. Change-Id: I036a6919fead6e970505dccbb0da7bfbdf8cecc3	2015-05-27 15:01:11 +01:00
Alistair Coles	98b725fec6	Cleanup and extend end to end ssync tests Extends the existing end to end ssync tests with a test using replication policy. Also some cleanup and improvements to the test framework e.g. rather than faking the connection between sender and receiver, use a real connection and wrap it to capture traffic for verification. Change-Id: Id71d2eb3fb8fa15c016ef151aacf95f97196a902	2015-05-13 11:05:13 +01:00
paul luse	647b66a2ce	Erasure Code Reconstructor This patch adds the erasure code reconstructor. It follows the design of the replicator but: - There is no notion of update() or update_deleted(). - There is a single job processor - Jobs are processed partition by partition. - At the end of processing a rebalanced or handoff partition, the reconstructor will remove successfully reverted objects if any. And various ssync changes such as the addition of reconstruct_fa() function called from ssync_sender which performs the actual reconstruction while sending the object to the receiver Co-Authored-By: Alistair Coles <alistair.coles@hp.com> Co-Authored-By: Thiago da Silva <thiago@redhat.com> Co-Authored-By: John Dickinson <me@not.mn> Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com> Co-Authored-By: Samuel Merritt <sam@swiftstack.com> Co-Authored-By: Christian Schwede <christian.schwede@enovance.com> Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com> blueprint ec-reconstructor Change-Id: I7d15620dc66ee646b223bb9fff700796cd6bef51	2015-04-14 00:52:17 -07:00
Alistair Coles	fa89064933	Per-policy DiskFile classes Adds specific disk file classes for EC policy types. The new ECDiskFile and ECDiskFileWriter classes are used by the ECDiskFileManager. ECDiskFileManager is registered with the DiskFileRouter for use with EC_POLICY type policies. Refactors diskfile tests into BaseDiskFileMixin and BaseDiskFileManagerMixin classes which are then extended in subclasses for the legacy replication-type DiskFile* and ECDiskFile* classes. Refactor to prefer use of a policy instance reference over a policy_index int to refer to a policy. Add additional verification to DiskFileManager.get_dev_path to validate the device root with common.constraints.check_dir, even when mount_check is disabled for use in on a virtual swift-all-in-one. Co-Authored-By: Thiago da Silva <thiago@redhat.com> Co-Authored-By: John Dickinson <me@not.mn> Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com> Co-Authored-By: Paul Luse <paul.e.luse@intel.com> Co-Authored-By: Samuel Merritt <sam@swiftstack.com> Co-Authored-By: Christian Schwede <christian.schwede@enovance.com> Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com> Change-Id: I22f915160dc67a9e18f4738c1ddf068344e8ad5d	2015-04-14 00:52:16 -07:00
Jenkins	28c99763e9	Merge "Fix ssync send_delete"	2015-02-13 00:18:38 +00:00
Alistair Coles	82e5090848	Fix ssync send_delete The ssync_sender send_delete method treats its timestamp argument as a string when in fact it is passed a Timestamp object. As a result the method always raises an exception and deletes are never replicated. This patch fixes bug and adds unit and probe tests to verify expected behavior. Closes-Bug: 1421425 Change-Id: I664fb8d5dfea7362313037a67927ea90021c3f62	2015-02-12 21:44:36 +00:00
Kota Tsuyuzaki	20ca279d74	Efficient Replication for Distributed Regions This change provides an efficient way of replicating between regions of a globally distributed cluster. With this approach the object-replicator pushes replicas to one primary node in a remote region and then skips pushing them to the next primary node in that region, expecting asynchronous replication to handle them. This implementation includes a couple of changes to ssync_sender that allow the object-replicator to delete local handoff objects correctly. One is to return a list of the objects that exist in the remote region. The list includes the local paths of the objects which exist on both the local device and the remote device. The other is support for an existence check on specified objects. It requires the object list built by the first change. When the object list is given, ssync_sender only does a missing_check based on the list. These changes are needed because current swift cannot handle existence checks at the object level. Note that this feature will work partially (i.e. only when primary-to-primary) with rsync. Implements: blueprint efficient-replication Change-Id: I5d990444d7977f4127bb37f9256212c893438df1	2015-02-10 12:52:15 -08:00
Michael Barton	556568b1c3	use replication_ip in ssync Update ssync_sender to use replication_ip and replication_port from the ring. Those attributes are supposed to allow for a separate replication network, and are used by rsync replication. Change-Id: Ib4cc3cbc1503b85dfdfa0edab58a49c95eac5993	2014-10-16 01:56:48 +00:00
Paul Luse	873c52e608	Replace POLICY and POLICY_INDEX with string literals Replaced throughout code base & tox'd. Functional as well as probe tests pass with and without policies defined. POLICY --> 'X-Storage-Policy' POLICY_INDEX --> 'X-Backend-Storage-Policy-Index' Change-Id: Iea3d06de80210e9e504e296d4572583d7ffabeac	2014-06-23 12:52:50 -07:00
Paul Luse	b9707d497c	Add Storage Policy Support to ssync This patch makes ssync policy aware so that clusters using storage policies and ssync replication will replicate objects in all policies. DocImpact Implements: blueprint storage-policies Change-Id: I64879077676d764c6330e03734fc6665bb26f552	2014-06-18 17:31:38 -07:00
gholt	70fc7df6eb	Just trying to keep /tmp clean Change-Id: Ia8d7cf37a4f6a4652cb3440a896cefb411cdb41a	2013-12-16 17:14:00 +00:00
Clay Gerrard	b57fd4343f	use diskfile in ssync_sender tests Change-Id: I7993de98ce3eb4839fa5d72d1b6ce08e4a7c1451	2013-12-03 21:12:19 -08:00
gholt	a80c720af5	Object replication ssync (an rsync alternative) For this commit, ssync is just a direct replacement for how we use rsync. Assuming we switch over to ssync completely someday and drop rsync, we will then be able to improve the algorithms even further (removing local objects as we successfully transfer each one rather than waiting for whole partitions, using an index.db with hash-trees, etc., etc.) For easier review, this commit can be thought of in distinct parts: 1) New global_conf_callback functionality for allowing services to perform setup code before workers, etc. are launched. (This is then used by ssync in the object server to create a cross-worker semaphore to restrict concurrent incoming replication.) 2) A bit of shifting of items up from object server and replicator to diskfile or DEFAULT conf sections for better sharing of the same settings. conn_timeout, node_timeout, client_timeout, network_chunk_size, disk_chunk_size. 3) Modifications to the object server and replicator to optionally use ssync in place of rsync. This is done in a generic enough way that switching to FutureSync should be easy someday. 4) The biggest part, and (at least for now) completely optional part, are the new ssync_sender and ssync_receiver files. Nice and isolated for easier testing and visibility into test coverage, etc. All the usual logging, statsd, recon, etc. instrumentation is still there when using ssync, just as it is when using rsync. Beyond the essential error and exceptional condition logging, I have not added any additional instrumentation at this time. Unless there is something someone finds super pressing to have added to the logging, I think such additions would be better as separate change reviews. FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION CLUSTERS. Some of us will be in a limited fashion to look for any subtle issues, tuning, etc. but generally ssync is an experimental feature. 
In its current implementation it is probably going to be a bit slower than rsync, but if all goes according to plan it will end up much faster. There are no comparisons yet between ssync and rsync other than some raw virtual machine testing I've done to show it should compete well enough once we can put it in use in the real world. If you Tweet, Google+, or whatever, be sure to indicate it's experimental. It'd be best to keep it out of deployment guides, howtos, etc. until we all figure out if we like it, find it to be stable, etc. Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6	2013-11-07 16:52:01 +00:00
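
The frag_index fixes in commits d124ce5792 and a3facce53c above both turn on one detail of dict.get: the default is used only when the key is absent, not when the key is present with the value None. The sketch below illustrates that pitfall in isolation. It is not the code from ssync_sender; the helper name `frag_index_header` and the sample job dicts are assumptions made for illustration.

```python
def frag_index_header(job):
    """Return a header value for the job's frag_index.

    Hypothetical helper: maps both a missing key and an explicit None
    to the empty string, so the text 'None' is never sent.
    """
    frag_index = job.get('frag_index')
    return '' if frag_index is None else str(frag_index)


# A revert job that carries the key explicitly set to None.
revert_job = {'frag_index': None}

# The default '' is ignored because the key exists; casting the result
# to a string (as an HTTP client would for a header) yields 'None'.
naive = str(revert_job.get('frag_index', ''))

# The helper treats an explicit None like a missing key.
safe = frag_index_header(revert_job)


def parses_as_int(value):
    """Report whether a header value can be converted with int()."""
    try:
        int(value)
        return True
    except ValueError:
        return False
```

Here `naive` is a non-empty string that cannot be parsed as an integer, which matches the ValueError the commit messages describe, while `safe` is the empty string that a receiver can treat as "no fragment index".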

33 Commits
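
Commit 05de1305a9 above asks that a "Transfer-Encoding: chunked" request always be finished with a proper '0\r\n\r\n'. As a rough illustration of that framing, and not the ssync_sender implementation, the sketch below builds a chunked body by hand. The function names `encode_chunk` and `chunked_body` are assumptions introduced here.

```python
# The zero-length chunk that terminates a chunked HTTP body.
LAST_CHUNK = b'0\r\n\r\n'


def encode_chunk(data):
    """Frame one chunk: hex length, CRLF, payload, CRLF."""
    return b'%x\r\n%s\r\n' % (len(data), data)


def chunked_body(parts):
    """Join the parts into a complete chunked body.

    Empty parts are skipped, because a zero-length chunk would be read
    as the terminator. The terminator is always appended, even when
    there is nothing to send, so the peer sees a complete request.
    """
    return b''.join(encode_chunk(p) for p in parts if p) + LAST_CHUNK
```

Calling `chunked_body([b'hello', b'world!'])` frames each part with its length in hexadecimal and ends with the terminating chunk. A sender that hits an error part way through can still emit `LAST_CHUNK` before closing the socket, which is the behavior the commit message argues for.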