With a sufficiently undispersed ring it's possible to move an entire
replicas worth of parts and yet the value of dispersion may not get any
better (even though in reality dispersion has dramatically improved).
The problem is dispersion will currently only represent up to one whole
replica worth of parts being undispersed.
However with EC rings it's possible for more than one whole replicas
worth of partitions to be undispersed, in these cases the builder will
require multiple rebalance operations to fully disperse replicas - but
the dispersion value should improve with every rebalance.
N.B. with this change it's possible for rings with a bad dispersion
value to measure as having a significantly smaller dispersion value
after a rebalance (even though they may not have had their dispersion
change) because the total amount of bad dispersion we can measure has
been increased but we're normalizing within a similar range.
Closes-Bug: #1697543
Change-Id: Ifefff0260deac0c3e8b369a1e158686c89936686
Add a symbolic link ("symlink") object support to Swift. This
object will reference another object. GET and HEAD
requests for a symlink object will operate on the referenced object.
DELETE and PUT requests for a symlink object will operate on the
symlink object, not the referenced object, and will delete or
overwrite it, respectively.
POST requests are *not* forwarded to the referenced object and should
be sent directly. POST requests sent to a symlink object will
result in a 307 Error.
Historical information on symlink design can be found here:
https://github.com/openstack/swift-specs/blob/master/specs/in_progress/symlinks.rst.
https://etherpad.openstack.org/p/swift_symlinks
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: Janie Richling <jrichli@us.ibm.com>
Co-Authored-By: Kazuhiro MIYAHARA <miyahara.kazuhiro@lab.ntt.co.jp>
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: I838ed71bacb3e33916db8dd42c7880d5bb9f8e18
Signed-off-by: Thiago da Silva <thiago@redhat.com>
Currently, our integrity checking for objects is pretty weak when it
comes to object metadata. If the extended attributes on a .data or
.meta file get corrupted in such a way that we can still unpickle it,
we don't have anything that detects that.
This could be especially bad with encrypted etags; if the encrypted
etag (X-Object-Sysmeta-Crypto-Etag or whatever it is) gets some bits
flipped, then we'll cheerfully decrypt the cipherjunk into plainjunk,
then send it to the client. Net effect is that the client sees a GET
response with an ETag that doesn't match the MD5 of the object *and*
Swift has no way of detecting and quarantining this object.
Note that, with an unencrypted object, if the ETag metadatum gets
mangled, then the object will be quarantined by the object server or
auditor, whichever notices first.
As part of this commit, I also ripped out some mocking of
getxattr/setxattr in tests. It appears to be there to allow unit tests
to run on systems where /tmp doesn't support xattrs. However, since
the mock is keyed off of inode number and inode numbers get re-used,
there's lots of leakage between different test runs. On a real FS,
unlinking a file and then creating a new one of the same name will
also reset the xattrs; this isn't the case with the mock.
The mock was pretty old; Ubuntu 12.04 and up all support xattrs in
/tmp, and recent Red Hat / CentOS releases do too. The xattr mock was
added in 2011; maybe it was to support Ubuntu Lucid Lynx?
Bonus: now you can pause a test with the debugger, inspect its files
in /tmp, and actually see the xattrs along with the data.
Since this patch now uses a real filesystem for testing filesystem
operations, tests are skipped if the underlying filesystem does not
support setting xattrs (eg tmpfs or more than 4k of xattrs on ext4).
References to "/tmp" have been replaced with calls to
tempfile.gettempdir(). This will allow setting the TMPDIR envvar in
test setup and getting an XFS filesystem instead of ext4 or tmpfs.
THIS PATCH SIGNIFICANTLY CHANGES TESTING ENVIRONMENTS
With this patch, every test environment will require TMPDIR to be
using a filesystem that supports at least 4k of extended attributes.
Neither ext4 nor tempfs support this. XFS is recommended.
So why all the SkipTests? Why not simply raise an error? We still need
the tests to run on the base image for OpenStack's CI system. Since
we were previously mocking out xattr, there wasn't a problem, but we
also weren't actually testing anything. This patch adds functionality
to validate xattr data, so we need to drop the mock.
`test.unit.skip_if_no_xattrs()` is also imported into `test.functional`
so that functional tests can import it from the functional test
namespace.
The related OpenStack CI infrastructure changes are made in
https://review.openstack.org/#/c/394600/.
Co-Authored-By: John Dickinson <me@not.mn>
Change-Id: I98a37c0d451f4960b7a12f648e4405c6c6716808
An SLO PUT requires that we HEAD every referenced object; as a result, it
can be a very time-intensive operation. This makes it difficult as a
client to differentiate between a proxy-server that's still doing work and
one that's crashed but left the socket open.
Now, clients can opt-in to receiving heartbeats during long-running PUTs
by including the query parameter
heartbeat=on
With heartbeating turned on, the proxy will start its response immediately
with 202 Accepted then send a single whitespace character periodically
until the request completes. At that point, a final summary chunk will be
sent which includes a "Response Status" key indicating success or failure
and (if successful) an "Etag" key indicating the Etag of the resulting SLO.
This mechanism is very similar to the way bulk extractions and deletions
work, and even the way SLO behaves for ?multipart-manifest=delete requests.
Note that this is opt-in: this prevents us from sending the 202 response
to existing clients that may mis-interpret it as an immediate indication
of success.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Related-Bug: 1718811
Change-Id: I65cee5f629c87364e188aa05a06d563c3849c8f3
The overview_policies doc makes reference to an `alias` option when in
fact the option is `aliases`.
The sample storage policy snippet is correct, it's just incorrect when
listing the possible options.
This change changes the listed option to `aliases`.
Change-Id: Iddf0f19f4d50819ff6abd46e6a1156dc8e4a451d
This commit replaces boolean replication_one_per_device by an integer
replication_concurrency_per_device. The new configuration parameter is
passed to utils.lock_path() which now accept as an argument a limit for
the number of locks that can be acquired for a specific path.
Instead of trying to lock path/.lock, utils.lock_path() now tries to lock
files path/.lock-X, where X is in the range (0, N), N being the limit for
the number of locks allowed for the path. The default value of limit is
set to 1.
Change-Id: I3c3193344c7a57a8a4fc7932d1b10e702efd3572
It was deprecated and we discussed on this topic in Denver PTG
for Queen cycle. Main motivation for this work is that deprecated
post_as_copy option and its gate blocks future symlink work.
Change-Id: I411893db1565864ed5beb6ae75c38b982a574476
Update the doc link brought by the doc migration.
Although we had some effort to fix these, it still left lots of bad
doc link, I separate these changes into 3 patches aim to fix all of
these, this is the 2st patch for doc/manpages.
Change-Id: Id426c5dd45a812ef801042834c93701bb6e63a05
Update the doc link brought by the doc migration.
Although we had some effort to fix these, it still left lots of bad
doc link, I separate these changes into 3 patches aim to fix all of
these, this is the 3rd patch for doc/source/install.
Change-Id: I1b0c12cd5f893f1a84d12782ddc39f6d06beb2aa
- saio describes both 14.04 and 16.04 procedure
- currently we're testing on 16.04 (xenial) envrionment on the gate
Remaining task (probably another work):
- review the installation guide which adjusts to the ubuntu 14.04 LTS
Change-Id: Id690a1deabeb24bfc1af3ba3a3019794fe4b8eb9
If a number of DLO segments is larger than container listing limit,
Content-Length header will not be included in GET or HEAD response.
However, this fact is not explained in document of large objects.
This patch add explanation about this fact to the document.
Change-Id: Ia45fad05797f38fa8b6b0ed917b4f9d7fb337149
Closes-Bug: 1680219
This isn't actually used (and in swift is commented out already)
and is a leftover from a thing we did about seven years ago.
Change-Id: I9889bcfd29054f14679ae7430b077ad3afb25b98
This patch adds OpenSuse to the build a SAIO development page.
OpenSuse's libssl.so naming is different then other Linux distros
and as such it can't simply use pip's cryptography wheel/binary,
which by default is linked to libssl.so.10.
To fix this, --no-binary cryptography was added to pip install:
pip install --no-binary cryptography -r requirements.txt
Which forces the cryptography module's binding to be compiled
against the correct libssl.so library.
Change-Id: I6a070f33d670edbb887433530c44e2cb509f0c58
The use of a keystone role name in container ACLs is supported
and tested. This patch adds documentation.
[1] fb3d01a974/swift/common/middleware/keystoneauth.py (L491-L497)
[2] test.unit.common.middleware.test_keystoneauth.TestAuthorize.test_authorize_succeeds_for_user_role_in_roles
Change-Id: I77df27393a10f1d8c5a43161fdd4eb08be632566
Closes-Bug: #1705300
This patch adds support for retrieving the encryption root secret from
an external key management system. In practice, this is currently
limited to Barbican.
Change-Id: I1700e997f4ae6fa1a7e68be6b97539a24046e80b
* Fixes warnings in RST file
* Suppress warning log from pyeclib during the doc build.
pyeclib emits a warning message on an older liberasurecode [1]
and sphinx treats this as error (when warning-is-error is set).
There is no need to check warnings during the doc build,
so we can safely suppress the warning.
This is a part of the doc migration community-wide effort.
http://specs.openstack.org/openstack/docs-specs/specs/pike/os-manuals-migration.html
[1] https://github.com/openstack/pyeclib/commit/d163972b
Change-Id: I9adaee29185a2990cc3985bbe0dd366e22f4f1a2
This change adds a new Strategy concept to the daemon module similar to
how we manage WSGI workers. We need to leverage multiple python
processes to get the concurrency properties we need. More workers will
rebalance much faster on dense chassis with many devices.
Currently the default is still only one process, and no workers. Set
reconstructor_workers in the [object-reconstructor] section to some
whole number <= the number of devices on a node to get that many
reconstructor workers.
Each worker will operate on a different subset of disks.
Once mode works as before, but tends to want to update recon drops a
little bit more.
If you change the rings, the strategy will shutdown workers and spawn
new ones.
You can kill the worker pids and the daemon strategy will respawn them.
New per-disk reconstructor stats are dumped to recon under the
object_reconstruction_per_disk key. To maintain legacy compatibility
and replication monitoring based on cycle times they are aggregated
every stats_interval (default 5 mins).
Change-Id: I28925a37f3985c9082b5a06e76af4dc3ec813abe
Clarify in usage statement and man pages that CLI override options for
swift-object-reconstructor and swift-object-replicator only have
effect when --once is used.
Also add a link to object reconstructor source code docs to the doc
index page for consistency with the other object services.
Change-Id: If348b340d59a672d3a19d4df231ebdb74f4aed51
Previously it was hard to navigate to a particular config section in
the deployment guide, and not possible to provide a link directly to
one section.
This patch makes each config section a heading so that it appears in
navigation tables and can be easily linked to. A list of config
sections is also added at the start of each server section.
Change-Id: Iecb0637fde521600a9163fa66b3dbdc176a71dff
Related-Bug: #1626290