This is an alternative approach to that proposed in [1]
Adds support for optional per-policy config sections
to be added in proxy-server.conf. This is highly desirable
to allow per-policy affinity options to be set for use with
duplicated EC policies [2] and composite rings [3].
Certain options found in per-policy conf sections will
override their equivalents that may be set in the
[app:proxy-server] section. Currently the options
handled that way are:
sorting_method
read_affinity
write_affinity
write_affinity_node_count
For example:
[proxy-server:policy:0]
sorting_method = affinity
read_affinity = r1=100
write_affinity = r1
write_affinity_node_count = 1 * replicas
The corresponding attributes of the proxy-server Application
are now available from instances of an OverrideConf object
that is obtained from Application.get_policy_options(policy).
[1] Related-Change: I9104fc789ba85ab3ab5ccd34096125b482821389
[2] Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
[3] Related-Change: I0d8928b55020592f8e75321d1f7678688301d797
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: I3f718f425f525baa80045ba067950c752bcaaefc
We've got these lovely __enter__ and __exit__ methods; let's use them!
Note that this also changes how we patch classes' setUp methods so we
don't set self._orig_POLICIES when the class is already patched. I
hope this may fix some sporadic failures that include tracebacks
that look like
proxy ERROR: ERROR 500 Traceback (most recent call last):
File ".../swift/obj/server.py", line 1105, in __call__
res = getattr(self, req.method)(req)
File ".../swift/common/utils.py", line 1626, in _timing_stats
resp = func(ctrl, *args, **kwargs)
File ".../swift/obj/server.py", line 880, in GET
policy=policy, frag_prefs=frag_prefs)
File ".../swift/obj/server.py", line 211, in get_diskfile
return self._diskfile_router[policy].get_diskfile(
File ".../swift/obj/diskfile.py", line 555, in __getitem__
return self.policy_to_manager[policy]
KeyError: ECStoragePolicy(...)
... and try to unpatch more gracefully with TestCase.addCleanup
Change-Id: Iaa3d42ec21758b0707155878a645e665aa36696c
This is a follow up for https://review.openstack.org/#/c/441921
all of this patch is for minor fixes, I found in my self-review.
Change-Id: Ib3a1dc983c3da69dea592114e25a5047ec91a2b9
Previously, we were testing that a 254 (!?) character name would be valid
when the maximum configured is 500. Now we'll test that 500 character
names are valid.
While we're at it, stop patching self.test_check. It was unnecessary,
and we were doing it badly.
Change-Id: Ia604fa7b809a97fbce176c82606af73cdb92828c
* Adds a composite_builder module which provides the functionality to
build a composite ring from a number of component ring builders.
* Add id to RingBuilder to differentiate rings in composite.
A RingBuilder now gets a UUID when it is saved to file if
it does not already have one. A RingBuilder loaded from
file does NOT get a UUID assigned unless it was previously persisted in
the file. This forces users to explicitly assign an id to
existing ring builders by saving the state back to file.
The UUID is included in first line of the output from:
swift-ring-builder <builder-file>
Background:
This is another implementation for Composite Ring [1]
to enable better dispersion for global erasure coded cluster.
The most significant difference from the related-change [1] is that this
solution attempts to solve the problem as an offline tool rather than
dynamic compositing on the running servers. Due to the change, we gain
advantages such as:
- Less code and being simple
- No complex state validation on the running server
- Easy deployments with an offline tool
This patch does not provide a command line utility for managing
composite rings. The interface for such a tool is still under
discussion; this patch provides the enabling functionality first.
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
[1] Related-Change: I80ef36d3ac4d4b7c97a1d034b7fc8e0dc2214d16
Change-Id: I0d8928b55020592f8e75321d1f7678688301d797
domain_remap strip all starting/ending slashes. This behavior does not
allow to access objects starting or ending with slash. It is also
impacting staticweb middleware as staticweb tries to redirect
pseudo-directory to an URL with an ending slash, but as domain_remap
strip it, it goes to an infinite loop.
With this commit, the path of the request is passed as-is when
reconstructing the new request path. Example
http://www.example.com//obj/ was previously rewritten to
http://storage.example.com/v1/AUTH_abc/container/obj. It is now
rewritten to http://storage.example.com/v1/AUTH_abc/container//obj/
Closes-Bug: #1682293
Co-Authored-By: Christian Schwede <cschwede@redhat.com>
Change-Id: I1ef6b8752183d27103a3b0e720edcb4ce06fb837
Recently out gate started blowing up intermittently with a strange
case of ports mixed up. Sometimes a functional tests tries to
authorize on a port that's clearly an object server port, and
the like. As it turns out, eventlet developers added an unavoidable
SO_REUSEPORT into listen(), which makes listen(("localhost",0)
to reuse ports.
There's an issue about it:
https://github.com/eventlet/eventlet/issues/411
This patch is working around the problem while eventlet people
consider the issue.
Change-Id: I67522909f96495a6a30e1acdb79835dce2189549
Module setup() and teardown() functions are found by nosetests [1] but
unittests expects setUpModule() and tearDownModule() [2]. The latter
function names are also found by nosetests, so using those function
names enables the test module to be run with either nosetests or
unittest.
Although the tox test envs and .unittests script use nosetests, this
change allows the convenience of using unittest, for example when it
is the default test runner in a development environment such as
PyCharm.
This change also makes it unnecessary to explicitly call the setup()
and teardown() functions when executing the module directly.
[1] http://nose.readthedocs.io/en/latest/writing_tests.html#test-modules
[2] https://docs.python.org/2/library/unittest.html#setupmodule-and-teardownmodule
Change-Id: Ib2e5470a339af1f937b25d643b64356e8848ed36
Currently, EC GET responses from proxy to clients, unlike any other
response, include a "Connection: close" header. If the client has sent
a "Connection: keep-alive" header then eventlet.wsgi appends this to
the client response, so clients can receive a response with both
headers:
Connection: close
Connection: keep-alive
This patch fixes the proxy EC GET path to remove any Connection header
from it's response, but does not change the behaviour of eventlet.wsgi
with respect to returning any client provided 'Connection: keep-alive'
header.
Change-Id: I43cd27c978edb4a1a587f031dbbee26e9acdc920
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Closes-Bug: #1680731
From the docs for LogRecord.message [1],
> This is set when Formatter.format() is invoked.
Apparently we may find ourselves in a situation [2] where that never
happens? Really weird that it failed *midway* through the test though;
maybe some concurrent test removed all formatters?
ERROR: test_known_bad_ec_config
(test.unit.common.test_storage_policy.TestStoragePolicies)
----------------------------------------------------------------------
Traceback (most recent call last):
File ".../mock/mock.py", line 1305, in patched
return func(*args, **keywargs)
File ".../test/unit/common/test_storage_policy.py", line 688, in test_known_bad_ec_config
self.assertIn(msg, records[0].message)
AttributeError: 'LogRecord' object has no attribute 'message'
[1] https://docs.python.org/2/library/logging.html#logrecord-attributes
[2] http://logs.openstack.org/59/460359/1/check/gate-swift-tox-xfs-tmp-py27-ubuntu-xenial/5ecc2cb/console.html#_2017-04-27_01_06_43_346096
Change-Id: I8f5ac0ec1195a233f14edc0126de1d1cea7a6e2f
The proxy server on occasion has error limited a node by the time the
test runs, causing the proxie's node_iter failing to iter out this
error limited node. As the test uses a default FakeRing with no
extra handoffs, on this occasion we only get 2 requests which is not
enough for quorum, causing it to return a 503.
This patch sets the error_suppression_interval to 0 when creating
the proxy server. Meaning a node effectively isn't error_limited.
Change-Id: I96cf4c4d63594f803cc1cd57e874d1624db8e249
Closes-Bug: #1682026
In particular, in TestObjectController.test_object_delete_at_async_update
Rarely (<0.1% of the time?), it would fail with:
======================================================================
FAIL: test_object_delete_at_async_update
(test.unit.obj.test_server.TestObjectController)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/vagrant/swift/test/unit/obj/test_server.py", line 4826, in
test_object_delete_at_async_update
resp = req.get_response(self.object_controller)
File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/vagrant/swift/test/unit/__init__.py", line 1075, in
mocked_http_conn
raise AssertionError('left over status %r' % left_over_status)
AssertionError: left over status [500, 500]
-------------------- >> begin captured stdout << ---------------------
test INFO: None - - [26/Apr/2017:22:32:13 +0000] "PUT /sda1/p/a/c/o" 400
19 "-" "-" "-" 0.0003 "-" 23801 0
--------------------- >> end captured stdout << ----------------------
>> raise AssertionError('left over status %r' % [500, 500])
----------------------------------------------------------------------
Related-Bug: 1514111
Change-Id: I1af4a291fb67cf4b1829f167998a08644117a800
... and remove some cruft that couldn't possibly work
Change-Id: I560f0a29f0a881c63ec3cb910dbf5476fe2a915a
Related-Change-Id: I0d8928b55020592f8e75321d1f7678688301d797
Include a couple trivial cases, and verify that surrogate pairs get
collapsed.
Also, move it to a more-appropriate class.
Related-Change: I4c570c08c770636d57b1157e19d5b7034fd9ed4e (patchset 3)
Change-Id: Iab0fdafe08d06a9d677dc421e60779e94d27ba9b
EC object metadata can currently have a mixture of bytestrings and
unicode. The ssync_sender.send_put() method raises an
UnicodeDecodeError when it attempts to concatenate the metadata
values, if any bytestring has non-ascii characters.
The root cause of this issue is that the object server uses unicode
for the keys of some object metadata items that are received in the
footer of an EC PUT request, whereas all other object metadata keys
and values are persisted as bytestrings.
This patch fixes the bug by changing diskfile write_metadata()
function to encode all unicode metadata keys and values as utf8
encoded bytes before writing to disk. To cope with existing objects
that have a mixture of unicode and bytestring metadata, the diskfile
read_metadata() function is also changed so that all returned unicode
metadata keys and values are utf8 encoded. This ensures that
ssync_sender.send_put() (and any other caller of diskfile
read_metadata) only reads bytestrings from object metadata.
Closes-Bug: #1678018
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Ic23c55754ee142f6f5388dcda592a3afc9845c39
IMHO we shouldn't ever trust the invalidations file so much we try to
skip a listdir when creating a hashes.pkl for the first time. There may
be some subtle races looking back on the related patch, and it's related
patches.
This just makes some assertions to help demonstrate we should maintain
the invariant of setting hashes to valid via listdir.
Change-Id: I767e34a405de7911e9596e038e58a9a29f57a8f8
Related-Change-Id: I08c8cf09282f737103e580c1f57923b399abe58c
The refactoring in the Related-Change separated EC
specific object controller tests into EC specific TestCase
classes, but left two EC specific tests in the Replication
object controller test class. This patch moves them to the
appropriate test class.
Previously the tests were only executed once, now they are
executed in each of two subclasses using different EC
policies. As a result it was necessary to make the test
container name unique to the policy under test.
Related-Change: Ifd3d0fa66773e640bb61cc528f7a1b2358e97d91
Change-Id: Ie712ea91b5dd74c504a0dd6aa40c3d657277108c
Due to the refactoring of TestObjectController (related-change),
all of BaseTestECObjectController test methods are not being
needed to be unpatched because they are expected to run for test
setup-ed policies.
This patch works for items as follows:
- Move part of setUp/tearDown routines at BaseTestObjectController
needed by only TestReplicatedObjectController which affects
patch_policies
- Remove all unpatch_policies from BaseTestECObjectController
- Set up self.ec_policy to avoid to set policy index and
retrieve the policy for each test method.
The reason why I didn't squash this up to the related parent patch is
to clarify what was changed at those patches. The parent is for just
clustering the tests for each test class and this one attempts to
improve.
Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
Change-Id: I25a3f8fc837706d78dca226fe282d9e5ead65a0d
Object paths can have non-ascii characters. Device dicts will
have unicode values. Forming a string using both will cause the
object path to be coerced to UTF8, which currently causes a
UnicodeDecodeError. This causes _get_response() to not return
and the recosntructor hangs.
The call to _full_path() is moved outside of _get_response()
(where its result is used in the exception handler logging)
so that _get_response() will always return even if _full_path()
raises an exception.
Unit tests are refactored to split out a new class with those
tests using an object name and the _full_path method, so that
the class can be subclassed to use an object name with non-ascii
characters.
Existing probe tests are subclassed to repeat using non-ascii
chars in object paths.
Change-Id: I4c570c08c770636d57b1157e19d5b7034fd9ed4e
Closes-Bug: 1679175
This patch uses a more specific string to match
the ring name in recon cli.
Almost all the places in the project where need to get the
suffix (like ring.gz, ts, data and .etc) will include the '.'
in front, if a file named 'object.sring.gz' in the swift_dir
will be added in the ring_names, which is not what we want.
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Closes-Bug: #1680704
Change-Id: Ida659fa71585f9b0cf36da75b58b28e6a25533df
When drive with container or account database is unmounted
replicator pushes database to handoff location. But this
handoff location finds replica with unmounted drive and
pushes database to the *next* handoff until all handoffs has
a replica - all container/account servers has replicas of
all unmounted drives.
This patch solves:
- Consumption of iterator on handoff location that results in
replication to the next and next handoff.
- StopIteration exception stopped not finished loop over
available handoffs if no more nodes exists for db replication
candidency.
Regression was introduced in 2.4.0 with rsync compression.
Co-Author: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: I344f9daaa038c6946be11e1cf8c4ef104a09e68b
Closes-Bug: 1675500
SSYNC is designed to limit concurrent incoming connections in order to
prevent IO contention. The reconstructor should expect remote
replication servers to fail ssync_sender when the remote is too busy.
When the remote rejects SSYNC - it should avoid forcing additional IO
against the remote with a REPLICATE request which causes suffix
rehashing.
Suffix rehashing via REPLICATE verbs takes two forms:
1) a initial pre-flight call to REPLICATE /dev/part will cause a remote
primary to rehash any invalid suffixes and return a map for the local
sender to compare so that a sync can be performed on any mis-matched
suffixes.
2) a final call to REPLICATE /dev/part/suf1-suf2-suf3[-sufX[...]] will
cause the remote primary to rehash the *given* suffixes even if they are
*not* invalid. This is a requirement for rsync replication because
after a suffix is synced via rsync the contents of a suffix dir will
likely have changed and the remote server needs to update it hashes.pkl
to reflect the new data.
SSYNC does not *need* to send a post-sync REPLICATE request. Any
suffixes that are modified by the SSYNC protocol will call _finalize_put
under the hood as it is syncing. It is however not harmful and
potentially useful to go ahead refresh hashes after an SSYNC while the
inodes of those suffixes are warm in the cache.
However, that only makes sense if the SSYNC conversation actually synced
any suffixes - if SSYNC is rejected for concurrency before it ever got
started there is no value in the remote performing a rehash. It may be
that *another* reconstructor is pushing data into that same partition
and the suffixes will become immediately invalidated.
If a ssync_sender does not successful finish a sync the reconstructor
should skip the REPLICATE call entirely and move on to the next
partition without causing any useless remote IO.
Closes-Bug: #1665141
Change-Id: Ia72c407247e4525ef071a1728750850807ae8231
Some public functions in the diskfile manager expect or return full
file paths. It implies a filesystem diskfile implementation.
To make it easier to plug alternate diskfile implementations, patch
functions to take more generic arguments.
This commit changes DiskFileManager _get_hashes() arguments from:
- partition_path, recalculate=None, do_listdir=False
to :
- device, partition, policy, recalculate=None, do_listdir=False
Callers are modified accordingly, in diskfile.py, reconstructor.py,
and replicator.py
Change-Id: I8e2d7075572e466ae2fa5ebef5e31d87eed90fec