This is an alternative approach to that proposed in [1]
Adds support for optional per-policy config sections
to be added in proxy-server.conf. This is highly desirable
to allow per-policy affinity options to be set for use with
duplicated EC policies [2] and composite rings [3].
Certain options found in per-policy conf sections will
override their equivalents that may be set in the
[app:proxy-server] section. Currently the options
handled that way are:
sorting_method
read_affinity
write_affinity
write_affinity_node_count
For example:
[proxy-server:policy:0]
sorting_method = affinity
read_affinity = r1=100
write_affinity = r1
write_affinity_node_count = 1 * replicas
The corresponding attributes of the proxy-server Application
are now available from instances of an OverrideConf object
that is obtained from Application.get_policy_options(policy).
[1] Related-Change: I9104fc789ba85ab3ab5ccd34096125b482821389
[2] Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
[3] Related-Change: I0d8928b55020592f8e75321d1f7678688301d797
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: I3f718f425f525baa80045ba067950c752bcaaefc
Add entries for these options in the deployment guide and
make the text in proxy-server.conf-sample and man page
consistent.
Change-Id: I5854ddb3e5864ddbeaf9ac2c930bfafdb47517c3
Deprecate swift-temp-url and call python-swiftclient's
implementation instead. This adds python-swiftclient as an
optional dependency of Swift which is noted in releasenotes.
Change-Id: I0404f16c21099cb7695430f5b63722729c305613
Add support for a 2+1 EC policy to be optionally used as default
policy when running in process functional tests.
The EC policy may be selected by setting the env var:
SWIFT_TEST_IN_PROCESS_CONF_LOADER=ec tox
when running .functests, or by using the new tox test env:
tox -e func-ec
Change-Id: I02e3553a74a024efdab91dcd609ac1cf4e4f3208
SAIO docs do suggest using Ubuntu 14.04, but if using
16.04 then systemctl needs to be used to have rsync service
restart on reboot.
Change-Id: I4fb0d3d063df61fbdfca981f06911148f3c4dc04
Layout the foundation for documenting the features which will enable
Global EC.
The formatting on the sections in our existing EC docs didn't follow
best practices [1] and it caused some sphinx build warnings.
1. http://www.sphinx-doc.org/en/stable/rest.html#sections
Change-Id: I2d164dafeb84629c75c3c2ff774329ee84270b7f
An operator proposing a web UX to its customers might want to allow web
browser to access some headers by default (eg: X-Storage-Policy,
X-Container-Read, ...). This commit adds a new setting to the
proxy-server to allow some headers to be added cluster-wide to the CORS
header Access-Control-Expose-Headers.
Change-Id: I5ca90a052f27c98a514a96ee2299bfa1b6d46334
This patch enables efficent PUT/GET for global distributed cluster[1].
Problem:
Erasure coding has the capability to decrease the amout of actual stored
data less then replicated model. For example, ec_k=6, ec_m=3 parameter
can be 1.5x of the original data which is smaller than 3x replicated.
However, unlike replication, erasure coding requires availability of at
least some ec_k fragments of the total ec_k + ec_m fragments to service
read (e.g. 6 of 9 in the case above). As such, if we stored the
EC object into a swift cluster on 2 geographically distributed data
centers which have the same volume of disks, it is likely the fragments
will be stored evenly (about 4 and 5) so we still need to access a
faraway data center to decode the original object. In addition, if one
of the data centers was lost in a disaster, the stored objects will be
lost forever, and we have to cry a lot. To ensure highly durable
storage, you would think of making *more* parity fragments (e.g.
ec_k=6, ec_m=10), unfortunately this causes *significant* performance
degradation due to the cost of mathmetical caluculation for erasure
coding encode/decode.
How this resolves the problem:
EC Fragment Duplication extends on the initial solution to add *more*
fragments from which to rebuild an object similar to the solution
described above. The difference is making *copies* of encoded fragments.
With experimental results[1][2], employing small ec_k and ec_m shows
enough performance to store/retrieve objects.
On PUT:
- Encode incomming object with small ec_k and ec_m <- faster!
- Make duplicated copies of the encoded fragments. The # of copies
are determined by 'ec_duplication_factor' in swift.conf
- Store all fragments in Swift Global EC Cluster
The duplicated fragments increase pressure on existing requirements
when decoding objects in service to a read request. All fragments are
stored with their X-Object-Sysmeta-Ec-Frag-Index. In this change, the
X-Object-Sysmeta-Ec-Frag-Index represents the actual fragment index
encoded by PyECLib, there *will* be duplicates. Anytime we must decode
the original object data, we must only consider the ec_k fragments as
unique according to their X-Object-Sysmeta-Ec-Frag-Index. On decode no
duplicate X-Object-Sysmeta-Ec-Frag-Index may be used when decoding an
object, duplicate X-Object-Sysmeta-Ec-Frag-Index should be expected and
avoided if possible.
On GET:
This patch inclues following changes:
- Change GET Path to sort primary nodes grouping as subsets, so that
each subset will includes unique fragments
- Change Reconstructor to be more aware of possibly duplicate fragments
For example, with this change, a policy could be configured such that
swift.conf:
ec_num_data_fragments = 2
ec_num_parity_fragments = 1
ec_duplication_factor = 2
(object ring must have 6 replicas)
At Object-Server:
node index (from object ring): 0 1 2 3 4 5 <- keep node index for
reconstruct decision
X-Object-Sysmeta-Ec-Frag-Index: 0 1 2 0 1 2 <- each object keeps actual
fragment index for
backend (PyEClib)
Additional improvements to Global EC Cluster Support will require
features such as Composite Rings, and more efficient fragment
rebalance/reconstruction.
1: http://goo.gl/IYiNPk (Swift Design Spec Repository)
2: http://goo.gl/frgj6w (Slide Share for OpenStack Summit Tokyo)
Doc-Impact
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Idd155401982a2c48110c30b480966a863f6bd305
Make development guidelines consistent with recent
changes to tox envs, specifically the removal of
"-in-process" from some test env names [1] and the
removal of the "func-fast-post" test env [2].
[1] Related-Change: I02477d81b836df71780942189d37d616944c4dce
[2] Related-Change: I6faf8fcfa0a1d96aaf0f5e0ad2106b2b416da22f
Change-Id: I08b92c005ee50beff09a92b4331dd7dbeed79bde
With this commit, the tempurl middleware accepts (besides
the traditional unix timestamps) also timestamps according
to the format '%Y-%m-%dT%H:%M:%SZ' (one acceptable form of ISO 8601).
The idea is to make the tempurls more user-friendly,
and has been formulated here:
Change-Id: I346a0241060a9559d178b30e60c957792bbeb9f0
Implements: blueprint human-readable-tempurl-timestamp
Let's remove the reference to swift-temp-url since
it has been deprecated and add a link to the swift client.
Change-Id: I70d64bf90f23a0f48b238ae6a99ab86f87d028a1
Signed-off-by: Thiago da Silva <thiago@redhat.com>
The Install Guides for Kilo and older are not published anymore, all
these links give a 404. Remove the links for these outdated versions.
Change-Id: I73e96b0ca4840d83c4f3561b4ffb08510d61c8d7
The reclaim_age is a DiskFile option, it doesn't make sense for two
different object services or nodes to use different values.
I also driveby cleanup the reclaim_age plumbing from get_hashes to
cleanup_ondisk_files since it's a method on the Manager and has access
to the configured reclaim_age. This fixes a bug where finalize_put
wouldn't use the [DEFAULT]/object-server configured reclaim_age - which
is normally benign but leads to weird behavior on DELETE requests with
really small reclaim_age.
There's a couple of places in the replicator and reconstructor that
reach into their manager to borrow the reclaim_age when emptying out
the aborted PUTs that failed to cleanup their files in tmp - but that
timeout doesn't really need to be coupled with reclaim_age and that
method could have just as reasonably been implemented on the Manager.
UpgradeImpact: Previously the reclaim_age was documented to be
configurable in various object-* services config sections, but that did
not work correctly unless you also configured the option for the
object-server because of REPLICATE request rehash cleanup. All object
services must use the same reclaim_age. If you require a non-default
reclaim age it should be set in the [DEFAULT] section. If there are
different non-default values, the greater should be used for all object
services and configured only in the [DEFAULT] section.
If you specify a reclaim_age value in any object related config you
should move it to *only* the [DEFAULT] section before you upgrade. If
you configure a reclaim_age less that your consistency window you are
likely to be eaten by a Grue.
Closes-Bug: #1626296
Change-Id: I2b9189941ac29f6e3be69f76ff1c416315270916
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
With a multipart-manifest PUT request, if client sends the md5 of the
segments' etags, a 422 Unprocessable Entity response is returned. This
patch fixes that and confirms the etag
Change-Id: I4598a2a3f16ca8727bb07bbb6d8efcfcae777796
Closes-Bug: #1213200
Co-Authored-By: Tim Burke <tim@swiftstack.com>
Previously, we still required that clients send "etag" and "size_bytes"
keys in their segment definitions. This was done as a way to guard
against typos leading to an accidental lack of verification.
However, typos should already be caught when we check for extra keys. As
a result, the only truly required key is "path".
Change-Id: Ie1d8691115f8c68b5a3f3b59317cdab59f9a3fca
The middleware now allows the usage of signatures with a prefix-based
scope. A prefix-based signature grants access to all objects which share
the same prefix. This avoids the creation of a large amount of signatures,
when a whole container or pseudofolder is shared.
Please see spec: https://review.openstack.org/#/c/199607/
Change-Id: I03b68eb74dae6196b5e63e711ef642ff7d2cfdc9
Change [1] requires pip >= 7.0.1. In general, test
environments will need to have pip and virtualenv in line
with global requirements. Add a reference to the global
requirements rather than specifying versions that will
inevitably become stale.
Also adds inline literal markup to all occurences of "tox"
for consistency.
[1] Related-Change: I7c60be623b4340ee34ae1aa520f17b303348811d
Change-Id: I7730f22c594a3521973cb9ff264a7d50f2b86a1a
Depends-On: I45eee626713438af5fc676f2b5f636d7ec23f7be