3088 Commits

Author SHA1 Message Date
Jenkins
96de9ad126 Merge "Clean up how PatchPolicies works" 2017-05-25 09:20:52 +00:00
Jenkins
263dc8a3f3 Merge "Enable per policy proxy config options" 2017-05-25 06:34:48 +00:00
Jenkins
24e8789689 Merge "Small minor fixes for composite ring functionality" 2017-05-24 22:57:30 +00:00
Jenkins
6f7b1f9ee2 Merge "Use setUpModule instead of setup for module level unit test setup" 2017-05-24 21:29:09 +00:00
Alistair Coles
45884c1102 Enable per policy proxy config options
This is an alternative approach to that proposed in [1]

Adds support for optional per-policy config sections
to be added in proxy-server.conf. This is highly desirable
to allow per-policy affinity options to be set for use with
duplicated EC policies [2] and composite rings [3].

Certain options found in per-policy conf sections will
override their equivalents that may be set in the
[app:proxy-server] section. Currently the options
handled that way are:

  sorting_method
  read_affinity
  write_affinity
  write_affinity_node_count

For example:

  [proxy-server:policy:0]
  sorting_method = affinity
  read_affinity = r1=100
  write_affinity = r1
  write_affinity_node_count = 1 * replicas

The corresponding attributes of the proxy-server Application
are now available from instances of an OverrideConf object
that is obtained from Application.get_policy_options(policy).
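
The override behaviour described above can be sketched roughly as follows.
This is a minimal illustration using only the standard library; the helper
name effective_options and the sample values in [app:proxy-server] are
assumptions made for the example, and this is not the OverrideConf
implementation from the patch.

```python
import configparser

# Options the commit message lists as overridable per policy.
OVERRIDABLE = (
    "sorting_method",
    "read_affinity",
    "write_affinity",
    "write_affinity_node_count",
)

# Sample config; the [app:proxy-server] values are made up for illustration.
SAMPLE_CONF = """
[app:proxy-server]
sorting_method = shuffle
read_affinity = r2=100

[proxy-server:policy:0]
sorting_method = affinity
read_affinity = r1=100
write_affinity = r1
write_affinity_node_count = 1 * replicas
"""


def effective_options(conf_text, policy_index):
    """Hypothetical helper: per-policy values override the app-level ones."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    options = {}
    sections = ("app:proxy-server", "proxy-server:policy:%d" % policy_index)
    for section in sections:  # the later (per-policy) section wins
        if not parser.has_section(section):
            continue
        for name in OVERRIDABLE:
            if parser.has_option(section, name):
                options[name] = parser.get(section, name)
    return options
```

With this sketch, policy 0 picks up its own affinity settings, while a policy
without a dedicated section falls back to the [app:proxy-server] values.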

[1] Related-Change: I9104fc789ba85ab3ab5ccd34096125b482821389
[2] Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
[3] Related-Change: I0d8928b55020592f8e75321d1f7678688301d797

Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: I3f718f425f525baa80045ba067950c752bcaaefc
2017-05-23 20:22:30 +01:00
Tim Burke
1b991803e8 Clean up how PatchPolicies works
We've got these lovely __enter__ and __exit__ methods; let's use them!

Note that this also changes how we patch classes' setUp methods so we
don't set self._orig_POLICIES when the class is already patched.  I
hope this may fix some sporadic failures that include tracebacks
that look like

  proxy ERROR: ERROR 500 Traceback (most recent call last):
    File ".../swift/obj/server.py", line 1105, in __call__
      res = getattr(self, req.method)(req)
    File ".../swift/common/utils.py", line 1626, in _timing_stats
      resp = func(ctrl, *args, **kwargs)
    File ".../swift/obj/server.py", line 880, in GET
      policy=policy, frag_prefs=frag_prefs)
    File ".../swift/obj/server.py", line 211, in get_diskfile
      return self._diskfile_router[policy].get_diskfile(
    File ".../swift/obj/diskfile.py", line 555, in __getitem__
      return self.policy_to_manager[policy]
  KeyError: ECStoragePolicy(...)

... and try to unpatch more gracefully with TestCase.addCleanup
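
A rough sketch of the pattern, assuming a simplified patcher: the REGISTRY
dict and this PatchPolicies class are stand-ins invented for the example, not
the real test helper. It shows a context manager being entered in setUp and
its __exit__ being registered with TestCase.addCleanup.

```python
import unittest

# Stand-in for a module-level policy collection; purely illustrative.
REGISTRY = {"policies": ["replicated"]}


class PatchPolicies(object):
    """Simplified patcher that swaps REGISTRY['policies'] while active."""

    def __init__(self, policies):
        self.policies = policies
        self._orig = None

    def __enter__(self):
        self._orig = REGISTRY["policies"]
        REGISTRY["policies"] = self.policies
        return self

    def __exit__(self, exc_type, exc_value, tb):
        REGISTRY["policies"] = self._orig
        return False  # never swallow exceptions


class ExampleTest(unittest.TestCase):
    def setUp(self):
        patcher = PatchPolicies(["ec"])
        patcher.__enter__()
        # Registered cleanups run even if the test (or a later setUp step) fails.
        self.addCleanup(patcher.__exit__, None, None, None)

    def test_sees_patched_policies(self):
        self.assertEqual(REGISTRY["policies"], ["ec"])
```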

Change-Id: Iaa3d42ec21758b0707155878a645e665aa36696c
2017-05-19 17:59:36 -07:00
Jenkins
9089e44c0b Merge "Add Composite Ring Functionality" 2017-05-18 10:18:31 +00:00
Kota Tsuyuzaki
4dc985a1fa Small minor fixes for composite ring functionality
This is a follow-up to https://review.openstack.org/#/c/441921.
All of the changes in this patch are minor fixes that I found during self-review.

Change-Id: Ib3a1dc983c3da69dea592114e25a5047ec91a2b9
2017-05-18 01:48:14 -07:00
Tim Burke
582af7cd9d name_check: better test maximum_length
Previously, we were testing that a 254 (!?) character name would be valid
when the maximum configured is 500. Now we'll test that 500 character
names are valid.

While we're at it, stop patching self.test_check. It was unnecessary,
and we were doing it badly.

Change-Id: Ia604fa7b809a97fbce176c82606af73cdb92828c
2017-05-16 17:59:52 -07:00
Kota Tsuyuzaki
d40031b46f Add Composite Ring Functionality
* Adds a composite_builder module which provides the functionality to
  build a composite ring from a number of component ring builders.

* Add id to RingBuilder to differentiate rings in composite.
  A RingBuilder now gets a UUID when it is saved to file if
  it does not already have one. A RingBuilder loaded from
  file does NOT get a UUID assigned unless it was previously persisted in
  the file. This forces users to explicitly assign an id to
  existing ring builders by saving the state back to file.

  The UUID is included in first line of the output from:

    swift-ring-builder <builder-file>
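
The id rules above might be sketched like this. SketchBuilder and its JSON
file format are assumptions made for illustration; the real RingBuilder
persists its state differently.

```python
import json
import uuid


class SketchBuilder(object):
    """Toy builder that only models the id-on-save behaviour."""

    def __init__(self, builder_id=None):
        self.id = builder_id

    def save(self, path):
        # An id is assigned on save only if the builder has none yet.
        if self.id is None:
            self.id = uuid.uuid4().hex
        with open(path, "w") as f:
            json.dump({"id": self.id}, f)

    @classmethod
    def load(cls, path):
        # Loading never invents an id; it only restores a persisted one.
        with open(path) as f:
            state = json.load(f)
        return cls(builder_id=state.get("id"))
```

Under these assumptions, a builder loaded from a file written before ids
existed has id None until it is explicitly saved again.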

Background:

This is another implementation of Composite Ring [1],
intended to enable better dispersion for global erasure-coded clusters.

The most significant difference from the related-change [1] is that this
solution attempts to solve the problem as an offline tool rather than
dynamic compositing on the running servers. Due to the change, we gain
advantages such as:

- Less and simpler code
- No complex state validation on the running server
- Easy deployments with an offline tool

This patch does not provide a command line utility for managing
composite rings. The interface for such a tool is still under
discussion; this patch provides the enabling functionality first.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>

[1] Related-Change: I80ef36d3ac4d4b7c97a1d034b7fc8e0dc2214d16
Change-Id: I0d8928b55020592f8e75321d1f7678688301d797
2017-05-15 16:42:00 -07:00
Jenkins
4a19917827 Merge "Open-code eventlet.listen()" 2017-05-12 16:46:28 +00:00
Romain LE DISEZ
6db12b87ff Fix domain_remap when obj starts/ends with slash
domain_remap strips all leading and trailing slashes from the path. This
behavior makes it impossible to access objects whose names start or end
with a slash. It also affects the staticweb middleware: staticweb tries to
redirect a pseudo-directory to a URL with a trailing slash, but because
domain_remap strips that slash, the request ends up in an infinite
redirect loop.

With this commit, the path of the request is passed as-is when
reconstructing the new request path. For example,
http://www.example.com//obj/ was previously rewritten to
http://storage.example.com/v1/AUTH_abc/container/obj. It is now
rewritten to http://storage.example.com/v1/AUTH_abc/container//obj/
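
The difference between the two rewrites can be sketched as below. The
function names are hypothetical and the account and container are passed in
directly; the real middleware derives them from the request.

```python
def remap_path_old(account, container, path):
    """Old behaviour (sketch): slashes are stripped from the object path."""
    return "/v1/%s/%s/%s" % (account, container, path.strip("/"))


def remap_path_new(account, container, path):
    """New behaviour (sketch): the request path is appended as-is."""
    return "/v1/%s/%s%s" % (account, container, path)


old = remap_path_old("AUTH_abc", "container", "//obj/")
new = remap_path_new("AUTH_abc", "container", "//obj/")
```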

Closes-Bug: #1682293
Co-Authored-By: Christian Schwede <cschwede@redhat.com>
Change-Id: I1ef6b8752183d27103a3b0e720edcb4ce06fb837
2017-05-11 08:47:23 -04:00
Pete Zaitcev
5dfc3a75fb Open-code eventlet.listen()
Recently our gate started blowing up intermittently with a strange
case of mixed-up ports. Sometimes a functional test tries to
authorize on a port that's clearly an object server port, and
the like. As it turns out, eventlet developers added an unavoidable
SO_REUSEPORT into listen(), which makes listen(("localhost", 0))
reuse ports.

There's an issue about it:
 https://github.com/eventlet/eventlet/issues/411

This patch is working around the problem while eventlet people
consider the issue.
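
As a rough idea of what open-coding the listen call can look like, here is a
sketch using only the standard socket module. The helper name and the
backlog value are assumptions, and this is not necessarily the code the patch
adds; the point is only that SO_REUSEPORT is never set.

```python
import socket


def listen_on_free_port(host="127.0.0.1", backlog=50):
    """Bind to an OS-assigned port without setting SO_REUSEPORT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR only eases rebinding after restarts; it does not let two
    # listening sockets share the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, 0))  # port 0: the kernel picks a free port
    sock.listen(backlog)
    return sock
```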

Change-Id: I67522909f96495a6a30e1acdb79835dce2189549
2017-05-11 01:39:14 -06:00
Jenkins
1f36582efb Merge "Fix unit tests on i386 and other archs" 2017-05-10 19:58:45 +00:00
Jenkins
2abffb99b9 Merge "Fix sporadic failure in TestObjectController.test_container_update_async" 2017-05-10 14:51:35 +00:00
Jenkins
30898435b1 Merge "Stop including Connection header in EC GET responses" 2017-05-05 22:12:05 +00:00
Tim Burke
50357de575 Fix sporadic failure in TestObjectController.test_container_update_async
Change-Id: Ie4d58626ebe97049703802a43c669cc78cf60f8b
Related-Change: I15f36e191cfe3ee6c82b4be56e8618ec0230e328
Closes-Bug: #1589994
2017-05-05 00:11:39 +00:00
Alistair Coles
511ac2ee60 Use setUpModule instead of setup for module level unit test setup
Module setup() and teardown() functions are found by nosetests [1] but
unittest expects setUpModule() and tearDownModule() [2]. The latter
function names are also found by nosetests, so using those function
names enables the test module to be run with either nosetests or
unittest.

Although the tox test envs and .unittests script use nosetests, this
change allows the convenience of using unittest, for example when it
is the default test runner in a development environment such as
PyCharm.

This change also makes it unnecessary to explicitly call the setup()
and teardown() functions when executing the module directly.

[1] http://nose.readthedocs.io/en/latest/writing_tests.html#test-modules
[2] https://docs.python.org/2/library/unittest.html#setupmodule-and-teardownmodule

Change-Id: Ib2e5470a339af1f937b25d643b64356e8848ed36
2017-05-04 12:47:17 +01:00
Jenkins
dd3bc8fe61 Merge "Move EC-specific unit test to EC Test class" 2017-05-03 23:20:36 +00:00
Jenkins
33a422af9f Merge "Fix (un)patch_policies" 2017-05-03 21:11:39 +00:00
Jenkins
4e61735e53 Merge "Use LogRecord.msg instead of LogRecord.message in tests" 2017-05-03 10:36:02 +00:00
Jenkins
d7a6d6e1e9 Merge "Do not sync suffixes when remote rejects reconstructor revert" 2017-05-01 20:38:07 +00:00
Alistair Coles
6c320b2908 Stop including Connection header in EC GET responses
Currently, EC GET responses from proxy to clients, unlike any other
response, include a "Connection: close" header. If the client has sent
a "Connection: keep-alive" header then eventlet.wsgi appends this to
the client response, so clients can receive a response with both
headers:

Connection: close
Connection: keep-alive

This patch fixes the proxy EC GET path to remove any Connection header
from its response, but does not change the behaviour of eventlet.wsgi
with respect to returning any client provided 'Connection: keep-alive'
header.
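
The header removal amounts to something like the following sketch. The
function name is hypothetical and headers are modelled as a plain list of
(name, value) tuples rather than Swift's response objects.

```python
def strip_connection_header(headers):
    """Return the headers without any Connection entry (case-insensitive)."""
    return [(name, value) for name, value in headers
            if name.lower() != "connection"]


cleaned = strip_connection_header([
    ("Content-Type", "application/octet-stream"),
    ("Connection", "close"),
])
```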

Change-Id: I43cd27c978edb4a1a587f031dbbee26e9acdc920
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Closes-Bug:  #1680731
2017-05-01 18:21:18 +01:00
Tim Burke
387ce13aa1 Use LogRecord.msg instead of LogRecord.message in tests
From the docs for LogRecord.message [1],

> This is set when Formatter.format() is invoked.

Apparently we may find ourselves in a situation [2] where that never
happens? Really weird that it failed *midway* through the test though;
maybe some concurrent test removed all formatters?

ERROR: test_known_bad_ec_config
(test.unit.common.test_storage_policy.TestStoragePolicies)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../mock/mock.py", line 1305, in patched
    return func(*args, **keywargs)
  File ".../test/unit/common/test_storage_policy.py", line 688, in test_known_bad_ec_config
    self.assertIn(msg, records[0].message)
AttributeError: 'LogRecord' object has no attribute 'message'
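
The behaviour can be reproduced with the standard logging module alone. In
this small sketch the record contents are made up; it shows that msg exists
from construction while message only appears after Formatter.format() runs.

```python
import logging

# A record as a handler would receive it, before any formatter has run.
record = logging.LogRecord(
    name="example", level=logging.WARNING, pathname="example.py",
    lineno=1, msg="bad EC config: %s", args=("policy-1",), exc_info=None)

has_message_before = hasattr(record, "message")  # not set yet
raw_msg = record.msg                  # unformatted message, always present
rendered = record.getMessage()        # msg % args, needs no formatter

logging.Formatter().format(record)    # this call sets record.message
has_message_after = hasattr(record, "message")
```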

[1] https://docs.python.org/2/library/logging.html#logrecord-attributes
[2] http://logs.openstack.org/59/460359/1/check/gate-swift-tox-xfs-tmp-py27-ubuntu-xenial/5ecc2cb/console.html#_2017-04-27_01_06_43_346096
Change-Id: I8f5ac0ec1195a233f14edc0126de1d1cea7a6e2f
2017-04-28 15:56:25 -07:00
Jenkins
e1b74c83c4 Merge "Fix sporadic failure in TestAccountController unit test" 2017-04-28 03:19:26 +00:00
Matthew Oliver
a07f7dc8c0 Fix sporadic failure in TestAccountController unit test
Occasionally the proxy server has already error limited a node by the
time the test runs, so the proxy's node_iter does not yield that error
limited node. Because the test uses a default FakeRing with no extra
handoffs, in that case we only get 2 requests, which is not enough for
quorum, so the proxy returns a 503.

This patch sets error_suppression_interval to 0 when creating the proxy
server, which means a node is effectively never error limited.

Change-Id: I96cf4c4d63594f803cc1cd57e874d1624db8e249
Closes-Bug: #1682026
2017-04-27 01:03:29 +00:00
Tim Burke
20072570d9 Fix sporadic failure in test/unit/obj/test_server.py
In particular, in TestObjectController.test_object_delete_at_async_update

Rarely (<0.1% of the time?), it would fail with:

======================================================================
FAIL: test_object_delete_at_async_update
(test.unit.obj.test_server.TestObjectController)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/vagrant/swift/test/unit/obj/test_server.py", line 4826, in
test_object_delete_at_async_update
    resp = req.get_response(self.object_controller)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/vagrant/swift/test/unit/__init__.py", line 1075, in
mocked_http_conn
    raise AssertionError('left over status %r' % left_over_status)
AssertionError: left over status [500, 500]
-------------------- >> begin captured stdout << ---------------------
test INFO: None - - [26/Apr/2017:22:32:13 +0000] "PUT /sda1/p/a/c/o" 400
19 "-" "-" "-" 0.0003 "-" 23801 0

--------------------- >> end captured stdout << ----------------------
>>  raise AssertionError('left over status %r' % [500, 500])

----------------------------------------------------------------------

Related-Bug: 1514111
Change-Id: I1af4a291fb67cf4b1829f167998a08644117a800
2017-04-26 15:51:16 -07:00
Jenkins
20f7e5f857 Merge "Improve test_get_valid_utf8_str coverage" 2017-04-26 20:09:04 +00:00
Clay Gerrard
6be5196fbe Make add_dev complain louder about missing keys
... and remove some cruft that couldn't possibly work

Change-Id: I560f0a29f0a881c63ec3cb910dbf5476fe2a915a
Related-Change-Id: I0d8928b55020592f8e75321d1f7678688301d797
2017-04-25 19:29:57 -07:00
Ondřej Nový
9e15effb3b Fix unit tests on i386 and other archs
Change-Id: I4f84b725e220e28919570fd7f296b63b34d0375d
2017-04-24 21:40:31 +00:00
Tim Burke
1776e0fd20 Improve test_get_valid_utf8_str coverage
Include a couple trivial cases, and verify that surrogate pairs get
collapsed.

Also, move it to a more-appropriate class.

Related-Change: I4c570c08c770636d57b1157e19d5b7034fd9ed4e (patchset 3)
Change-Id: Iab0fdafe08d06a9d677dc421e60779e94d27ba9b
2017-04-20 15:54:54 -07:00
Jenkins
cce5482bd8 Merge "Fix encoding issue in ssync_sender.send_put()" 2017-04-19 19:37:38 +00:00
Jenkins
939fa382fa Merge "Fix UnicodeDecodeError in reconstructor _full_path function" 2017-04-19 17:58:40 +00:00
Jenkins
88bca22549 Merge "Follow up tests for get_hashes regression" 2017-04-19 17:32:54 +00:00
Romain LE DISEZ
091157fc7f Fix encoding issue in ssync_sender.send_put()
EC object metadata can currently have a mixture of bytestrings and
unicode.  The ssync_sender.send_put() method raises an
UnicodeDecodeError when it attempts to concatenate the metadata
values, if any bytestring has non-ascii characters.

The root cause of this issue is that the object server uses unicode
for the keys of some object metadata items that are received in the
footer of an EC PUT request, whereas all other object metadata keys
and values are persisted as bytestrings.

This patch fixes the bug by changing diskfile write_metadata()
function to encode all unicode metadata keys and values as utf8
encoded bytes before writing to disk. To cope with existing objects
that have a mixture of unicode and bytestring metadata, the diskfile
read_metadata() function is also changed so that all returned unicode
metadata keys and values are utf8 encoded. This ensures that
ssync_sender.send_put() (and any other caller of diskfile
read_metadata) only reads bytestrings from object metadata.
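
The normalisation step could be sketched as follows, in Python 3 terms where
str plays the role of unicode. The function name is hypothetical; it is not
the diskfile code itself.

```python
def encode_metadata(metadata):
    """Return a copy of metadata with all text keys/values as UTF-8 bytes."""
    def to_bytes(item):
        # Text is encoded; anything already bytes is passed through untouched.
        return item.encode("utf-8") if isinstance(item, str) else item

    return {to_bytes(key): to_bytes(value) for key, value in metadata.items()}


mixed = {"X-Object-Meta-Name": b"caf\xc3\xa9", b"ETag": "abc123"}
normalised = encode_metadata(mixed)
```

Once every key and value is bytes, concatenating them can no longer trigger
an implicit decode.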

Closes-Bug: #1678018
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Ic23c55754ee142f6f5388dcda592a3afc9845c39
2017-04-19 18:05:52 +01:00
Clay Gerrard
b41f47f0e0 Follow up tests for get_hashes regression
IMHO we shouldn't ever trust the invalidations file so much that we try to
skip a listdir when creating a hashes.pkl for the first time.  Looking back
on the related patch and its related patches, there may be some subtle
races.

This just makes some assertions to help demonstrate we should maintain
the invariant of setting hashes to valid via listdir.

Change-Id: I767e34a405de7911e9596e038e58a9a29f57a8f8
Related-Change-Id: I08c8cf09282f737103e580c1f57923b399abe58c
2017-04-19 12:03:15 +01:00
Alistair Coles
c740447de5 Move EC-specific unit test to EC Test class
The refactoring in the Related-Change separated EC
specific object controller tests into EC specific TestCase
classes, but left two EC specific tests in the Replication
object controller test class. This patch moves them to the
appropriate test class.

Previously the tests were only executed once, now they are
executed in each of two subclasses using different EC
policies. As a result it was necessary to make the test
container name unique to the policy under test.

Related-Change: Ifd3d0fa66773e640bb61cc528f7a1b2358e97d91

Change-Id: Ie712ea91b5dd74c504a0dd6aa40c3d657277108c
2017-04-19 10:39:43 +01:00
Kota Tsuyuzaki
381640cf90 Fix (un)patch_policies
Due to the refactoring of TestObjectController (related-change),
the BaseTestECObjectController test methods no longer need to be
unpatched, because they are expected to run against the policies
configured in test setup.

This patch makes the following changes:

- Move the part of the setUp/tearDown routines in BaseTestObjectController
  that is needed only by TestReplicatedObjectController and that affects
  patch_policies

- Remove all unpatch_policies from BaseTestECObjectController

- Set up self.ec_policy so that each test method does not have to set the
  policy index and retrieve the policy.

I didn't squash this into the related parent patch in order to make clear
what changed in each patch. The parent only groups the tests into their
test classes; this one attempts to improve them.

Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
Change-Id: I25a3f8fc837706d78dca226fe282d9e5ead65a0d
2017-04-18 23:30:39 -07:00
Alistair Coles
83750cf79c Fix UnicodeDecodeError in reconstructor _full_path function
Object paths can have non-ascii characters. Device dicts will
have unicode values. Forming a string using both will cause the
object path to be coerced to UTF8, which currently causes a
UnicodeDecodeError. This causes _get_response() to not return
and the reconstructor hangs.

The call to _full_path() is moved outside of _get_response()
(where its result is used in the exception handler logging)
so that _get_response() will always return even if _full_path()
raises an exception.

Unit tests are refactored to split out a new class with those
tests using an object name and the _full_path method, so that
the class can be subclassed to use an object name with non-ascii
characters.

Existing probe tests are subclassed to repeat using non-ascii
chars in object paths.

Change-Id: I4c570c08c770636d57b1157e19d5b7034fd9ed4e
Closes-Bug: 1679175
2017-04-18 14:07:01 +01:00
Jenkins
9627bcbc94 Merge "Fixed get ring name from recon cli" 2017-04-18 10:25:59 +00:00
Jenkins
ae3080b768 Merge "TestObjectController refactoring" 2017-04-18 01:44:33 +00:00
lijunbo
6788bf4aad Fixed get ring name from recon cli
This patch uses a more specific string to match
the ring name in recon cli.

Almost everywhere else in the project, code that matches a file
suffix (such as .ring.gz, .ts, .data, etc.) includes the leading '.'.
Without it, a file named 'object.sring.gz' in the swift_dir
would be added to ring_names, which is not what we want.
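
The effect of the more specific match can be sketched as below. The helper
name and the file list are assumptions for illustration; this is not the
recon cli code.

```python
RING_SUFFIX = ".ring.gz"  # note the leading dot


def ring_names(filenames):
    """Return ring names for files ending in '.ring.gz' (sketch)."""
    return [name[:-len(RING_SUFFIX)] for name in filenames
            if name.endswith(RING_SUFFIX)]


files = ["object.ring.gz", "object-1.ring.gz", "object.sring.gz", "swift.conf"]
loose = [name for name in files if name.endswith("ring.gz")]  # no leading dot
strict = ring_names(files)
```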

Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Closes-Bug: #1680704
Change-Id: Ida659fa71585f9b0cf36da75b58b28e6a25533df
2017-04-13 07:01:03 +00:00
Pavel Kvasnička
bcd0eb70af Container drive error results in double space usage on the rest of the drives
When a drive holding a container or account database is unmounted,
the replicator pushes the database to a handoff location. But that
handoff location in turn finds the replica on the unmounted drive and
pushes the database to the *next* handoff, and so on until all handoffs
have a replica: all container/account servers end up with replicas of
the databases from all unmounted drives.

This patch solves:
- Consumption of the iterator on the handoff location, which results in
  replication to the next and next handoff.
- A StopIteration exception that stopped the unfinished loop over
  available handoffs when no more nodes exist as db replication
  candidates.

Regression was introduced in 2.4.0 with rsync compression.

Co-Author: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>

Change-Id: I344f9daaa038c6946be11e1cf8c4ef104a09e68b
Closes-Bug: 1675500
2017-04-11 09:49:59 +02:00
Jenkins
a22208043f Merge "Modify _get_hashes() arguments to be more generic" 2017-04-10 22:50:11 +00:00
Jenkins
b3e69acb43 Merge "Fix race when consolidating new partition" 2017-04-08 00:55:23 +00:00
Clay Gerrard
a0fcca1e05 Do not sync suffixes when remote rejects reconstructor revert
SSYNC is designed to limit concurrent incoming connections in order to
prevent IO contention.  The reconstructor should expect remote
replication servers to fail ssync_sender when the remote is too busy.
When the remote rejects SSYNC, the reconstructor should avoid forcing
additional IO against the remote with a REPLICATE request, which causes
suffix rehashing.

Suffix rehashing via REPLICATE verbs takes two forms:

1) an initial pre-flight call to REPLICATE /dev/part will cause a remote
primary to rehash any invalid suffixes and return a map for the local
sender to compare so that a sync can be performed on any mis-matched
suffixes.

2) a final call to REPLICATE /dev/part/suf1-suf2-suf3[-sufX[...]] will
cause the remote primary to rehash the *given* suffixes even if they are
*not* invalid.  This is a requirement for rsync replication because
after a suffix is synced via rsync the contents of a suffix dir will
likely have changed and the remote server needs to update its hashes.pkl
to reflect the new data.

SSYNC does not *need* to send a post-sync REPLICATE request.  Any
suffixes that are modified by the SSYNC protocol will call _finalize_put
under the hood as it is syncing.  It is however not harmful and
potentially useful to go ahead and refresh hashes after an SSYNC while the
inodes of those suffixes are warm in the cache.

However, that only makes sense if the SSYNC conversation actually synced
any suffixes - if SSYNC is rejected for concurrency before it ever got
started there is no value in the remote performing a rehash.  It may be
that *another* reconstructor is pushing data into that same partition
and the suffixes will become immediately invalidated.

If an ssync_sender does not successfully finish a sync, the reconstructor
should skip the REPLICATE call entirely and move on to the next
partition without causing any useless remote IO.
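
The control flow described above can be sketched like this. The function and
its callable parameters are hypothetical stand-ins for the ssync sender and
the REPLICATE request; the reconstructor's real interfaces differ.

```python
def revert_partition(run_ssync, send_replicate, suffixes):
    """Send the post-sync REPLICATE only if ssync actually succeeded."""
    if not run_ssync(suffixes):
        # Remote rejected or failed the sync: skip REPLICATE and move on,
        # so the remote is not forced into a pointless rehash.
        return False
    send_replicate(suffixes)
    return True


replicate_calls = []
rejected = revert_partition(lambda s: False, replicate_calls.append, ["abc"])
accepted = revert_partition(lambda s: True, replicate_calls.append, ["def"])
```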

Closes-Bug: #1665141

Change-Id: Ia72c407247e4525ef071a1728750850807ae8231
2017-04-06 17:37:34 +01:00
Clay Gerrard
88ebcafbb9 Fix intermittent test_unlink_* failures
Change-Id: Iab403724a418e5d8a44e56e58da782bc66eab6e4
Closes-Bug: #1579578
2017-03-29 22:30:54 +00:00
Alexandre Lécuyer
95905b0174 Modify _get_hashes() arguments to be more generic
Some public functions in the diskfile manager expect or return full
file paths. This implies a filesystem-based diskfile implementation.
To make it easier to plug in alternate diskfile implementations, patch
these functions to take more generic arguments.

This commit changes DiskFileManager _get_hashes() arguments from:
  - partition_path, recalculate=None, do_listdir=False
to:
  - device, partition, policy, recalculate=None, do_listdir=False

Callers are modified accordingly, in diskfile.py, reconstructor.py,
and replicator.py
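
To illustrate why the generic arguments help, here is a sketch in which a
filesystem-based backend derives the partition path itself. The helper name,
the devices_root parameter and the directory layout used here are assumptions
for the example, not code from the patch.

```python
import os


def partition_path(devices_root, device, partition, policy_index):
    """Sketch: a filesystem backend builds the path from generic arguments."""
    # Assumed layout: 'objects' for policy 0, 'objects-N' for other policies.
    data_dir = "objects" if policy_index == 0 else "objects-%d" % policy_index
    return os.path.join(devices_root, device, data_dir, str(partition))


legacy_policy_path = partition_path("/srv/node", "sda1", 123, 0)
ec_policy_path = partition_path("/srv/node", "sda1", 123, 1)
```

A backend that does not store data in files could accept the same
(device, partition, policy) arguments without ever building such a path.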

Change-Id: I8e2d7075572e466ae2fa5ebef5e31d87eed90fec
2017-03-29 14:57:40 +02:00
Jenkins
ca958317f0 Merge "Test that Manager.reload does stop/start in that order" 2017-03-29 08:26:08 +00:00
Jenkins
6cbf1c52e1 Merge "Remove unused returned value object_path from yield_hashes()" 2017-03-28 06:19:38 +00:00