24 Commits

Author SHA1 Message Date
Jenkins
169d1d8ab8 Merge "Require that known-bad EC schemes be deprecated" 2017-06-22 01:11:03 +00:00
Tim Burke
2c3ac543f4 Require that known-bad EC schemes be deprecated
We said we were going to do it, we've had two releases saying we'd do
it, we've even backported our saying it to Newton -- let's actually do
it.

Upgrade Consideration
=====================

Erasure-coded storage policies using isa_l_rs_vand and nparity >= 5 must
be configured as deprecated, preventing any new containers from being
created with such a policy. This configuration is known to harm data
durability. Any data in such policies should be migrated to a new
policy. See https://bugs.launchpad.net/swift/+bug/1639691 for more
information.
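
A rough way to audit a swift.conf for the condition described above can
be sketched in Python. This is only an illustration, not the check
Swift itself performs: the helper name find_offending_policies and the
sample configuration are hypothetical, and the option names are assumed
to follow the usual [storage-policy:N] layout.

```python
import configparser

BAD_EC_TYPE = "isa_l_rs_vand"   # scheme named in the commit message
MIN_BAD_PARITY = 5              # nparity >= 5 is the known-bad range


def find_offending_policies(conf_text):
    """Return names of policy sections that are known-bad but not deprecated.

    Hypothetical helper: it only inspects the text it is given.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    offending = []
    for section in parser.sections():
        if not section.startswith("storage-policy:"):
            continue
        opts = parser[section]
        if opts.get("policy_type", "replication") != "erasure_coding":
            continue
        nparity = opts.getint("ec_num_parity_fragments", fallback=0)
        is_bad = opts.get("ec_type") == BAD_EC_TYPE and nparity >= MIN_BAD_PARITY
        is_deprecated = opts.getboolean("deprecated", fallback=False)
        if is_bad and not is_deprecated:
            offending.append(section)
    return offending


# Made-up sample: policy 1 is known-bad and not deprecated, policy 2 is
# known-bad but already deprecated.
SAMPLE_CONF = """
[storage-policy:0]
name = gold
default = yes

[storage-policy:1]
name = ec-bad
policy_type = erasure_coding
ec_type = isa_l_rs_vand
ec_num_data_fragments = 10
ec_num_parity_fragments = 5

[storage-policy:2]
name = ec-old
policy_type = erasure_coding
ec_type = isa_l_rs_vand
ec_num_data_fragments = 10
ec_num_parity_fragments = 6
deprecated = yes
"""

print(find_offending_policies(SAMPLE_CONF))
```

Any section reported by such a check would need "deprecated = yes"
added, with its data migrated to a new policy.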

UpgradeImpact
Related-Change: I50159c9d19f2385d5f60112e9aaefa1a68098313
Change-Id: I8f9de0bec01032d9d9b58848e2a76ac92e65ab09
Closes-Bug: 1639691
2017-06-16 17:58:43 +00:00
Jenkins
4315093a28 Merge "More Global EC doc updates" 2017-06-13 21:13:07 +00:00
Clay Gerrard
4c7839d256 More Global EC doc updates
Soften the language about inefficiency on read and strengthen the
language encouraging the use of read affinity and composite rings.

Change-Id: Idc81a8c71e74ae28d384759700c5268d77ae3c85
2017-06-13 10:08:20 +01:00
Jenkins
41c8f1330f Merge "Update Global EC docs with reference to composite rings" 2017-06-13 06:26:54 +00:00
Alistair Coles
9665252352 Update Global EC docs with reference to composite rings
* In light of the composite rings feature being added [1],
  downgrade the warnings about EC Duplication [2] being
  experimental.

* Add links from Global EC docs to composite rings and
  per-policy proxy config features.

* Add discussion of using EC duplication with composite
  rings.

* Update Known Issues.

[1] Related-Change: I0d8928b55020592f8e75321d1f7678688301d797
[2] Related-Change: Idd155401982a2c48110c30b480966a863f6bd305

Change-Id: Id97a4899255945a6eaeacfef12fd29a2580588df
2017-06-12 16:58:02 -07:00
Alistair Coles
37ba21face Add structure to storage policy configuration guide
The description of storage policy config options was
unstructured and repetitive. This patch attempts to
improve the doc by gathering the notes for each option
into a structured list.

Change-Id: I57090b35a70f365e82fb0e29ab42e533d6359a7b
2017-05-31 11:11:32 +01:00
Clay Gerrard
38b99ad195 Global EC Under Development Documentation
Lay out the foundation for documenting the features which will enable
Global EC.

The formatting on the sections in our existing EC docs didn't follow
best practices [1] and it caused some sphinx build warnings.

1. http://www.sphinx-doc.org/en/stable/rest.html#sections

Change-Id: I2d164dafeb84629c75c3c2ff774329ee84270b7f
2017-03-07 15:25:54 +00:00
Tim Burke
2ca303597e Make Sphinx treat warnings as errors
...and fix up the one warning that's crept in.

Change-Id: I3985d027f0ac2119ceaeb4daba5964f937de6cea
2017-03-06 23:55:40 +00:00
Kota Tsuyuzaki
40ba7f6172 EC Fragment Duplication - Foundational Global EC Cluster Support
This patch enables efficient PUT/GET for globally distributed clusters[1].

Problem:
Erasure coding can reduce the amount of data actually stored to less
than the replicated model requires. For example, with ec_k=6 and ec_m=3
the stored data is 1.5x the original size, which is smaller than the 3x
of triple replication. However, unlike replication, erasure coding
requires at least ec_k of the total ec_k + ec_m fragments to be
available to service a read (e.g. 6 of 9 in the case above). As such,
if we stored an EC object in a Swift cluster spanning 2 geographically
distributed data centers which have the same volume of disks, it is
likely the fragments will be stored evenly (about 4 and 5), so we still
need to access a faraway data center to decode the original object. In
addition, if one of the data centers was lost in a disaster, the stored
objects will be lost forever, and we have to cry a lot. To ensure
highly durable storage, you might think of making *more* parity
fragments (e.g. ec_k=6, ec_m=10); unfortunately, this causes
*significant* performance degradation due to the cost of the
mathematical calculation for erasure coding encode/decode.
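
The arithmetic in this paragraph can be checked with a small Python
sketch. The function names are illustrative only and assume that every
fragment held at a site is distinct, which holds when no fragments are
duplicated.

```python
def storage_overhead(ec_k, ec_m):
    """Stored bytes per byte of original data for an ec_k + ec_m scheme."""
    return (ec_k + ec_m) / ec_k


def site_can_decode_alone(ec_k, fragments_at_site):
    """True if one site holds enough distinct fragments to decode."""
    return fragments_at_site >= ec_k


# ec_k=6, ec_m=3: 1.5x overhead, versus 3x for triple replication.
print(storage_overhead(6, 3))

# Nine fragments split about 4 and 5 across two data centers: neither
# site holds the 6 fragments needed, so a read must cross sites.
print(site_can_decode_alone(6, 4), site_can_decode_alone(6, 5))
```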

How this resolves the problem:
EC Fragment Duplication extends on the initial solution to add *more*
fragments from which to rebuild an object similar to the solution
described above. The difference is making *copies* of encoded fragments.
With experimental results[1][2], employing small ec_k and ec_m shows
enough performance to store/retrieve objects.

On PUT:

- Encode incoming object with small ec_k and ec_m  <- faster!
- Make duplicated copies of the encoded fragments. The number of copies
  is determined by 'ec_duplication_factor' in swift.conf
- Store all fragments in Swift Global EC Cluster

The duplicated fragments increase pressure on existing requirements
when decoding objects in service to a read request.  All fragments are
stored with their X-Object-Sysmeta-Ec-Frag-Index.  In this change, the
X-Object-Sysmeta-Ec-Frag-Index represents the actual fragment index
encoded by PyECLib, so there *will* be duplicates.  Any time we decode
the original object data, we must use ec_k fragments that are unique
according to their X-Object-Sysmeta-Ec-Frag-Index.  Duplicate
X-Object-Sysmeta-Ec-Frag-Index values should be expected and avoided
where possible while gathering fragments, and no duplicate may be used
in a decode.

On GET:

This patch includes the following changes:
- Change GET Path to sort primary nodes grouping as subsets, so that
  each subset will include unique fragments
- Change Reconstructor to be more aware of possibly duplicate fragments

For example, with this change, a policy could be configured such that

swift.conf:
ec_num_data_fragments = 2
ec_num_parity_fragments = 1
ec_duplication_factor = 2
(object ring must have 6 replicas)

At Object-Server:
node index (from object ring):  0 1 2 3 4 5 <- keep node index for
                                               reconstruct decision
X-Object-Sysmeta-Ec-Frag-Index: 0 1 2 0 1 2 <- each object keeps actual
                                               fragment index for
                                               backend (PyECLib)
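
The layout above is consistent with taking the node index modulo the
number of unique fragments (ec_k + ec_m). The Python sketch below works
under that assumption; both function names are hypothetical and this
is not the patch's actual code.

```python
def frag_index_for_node(node_index, ec_k, ec_m):
    """Map a node index from the object ring to a backend fragment index.

    Assumes duplicates repeat in ring order, as in the example table.
    """
    return node_index % (ec_k + ec_m)


def select_decodable_set(available, ec_k):
    """Pick ec_k fragments with unique fragment indexes.

    available: iterable of (node_index, frag_index) pairs.
    Returns {frag_index: node_index}, or None if too few unique indexes.
    """
    chosen = {}
    for node_index, frag_index in available:
        # Keep the first node seen for each index and skip duplicates.
        chosen.setdefault(frag_index, node_index)
        if len(chosen) == ec_k:
            return chosen
    return None


# ec_num_data_fragments=2, ec_num_parity_fragments=1,
# ec_duplication_factor=2, so the ring has 6 replicas.
print([frag_index_for_node(i, 2, 1) for i in range(6)])

# Nodes 0 and 3 hold the same fragment index, so a third response with
# a different index is needed before decoding.
print(select_decodable_set([(0, 0), (3, 0), (4, 1)], ec_k=2))
```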

Additional improvements to Global EC Cluster Support will require
features such as Composite Rings, and more efficient fragment
rebalance/reconstruction.

1: http://goo.gl/IYiNPk (Swift Design Spec Repository)
2: http://goo.gl/frgj6w (Slide Share for OpenStack Summit Tokyo)

Doc-Impact

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Idd155401982a2c48110c30b480966a863f6bd305
2017-02-22 10:56:13 -08:00
Tim Burke
13f1fc0885 Clean up EC overview docs a bit
Change-Id: I3bab2c015c63f32dcd6e4beefbcd0fcf22e91eec
2017-01-30 23:30:35 +00:00
Alistair Coles
b13b49a27c EC - eliminate .durable files
Instead of using a separate .durable file to indicate
the durable status of a .data file, rename the .data
to include a durable marker in the filename. This saves
one inode for every EC fragment archive.

An EC policy PUT will, as before, first rename a temp
file to:

   <timestamp>#<frag_index>.data

but now, when the object is committed, that file will be
renamed:

   <timestamp>#<frag_index>#d.data

with the '#d' suffix marking the data file as durable.
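
The naming scheme can be illustrated with a short Python sketch. The
helpers below are hypothetical and only mirror the filename patterns
shown above. They are not the diskfile implementation, and the
timestamp value is made up.

```python
def data_filename(timestamp, frag_index, durable=False):
    """Build '<timestamp>#<frag_index>.data', adding '#d' when durable."""
    marker = "#d" if durable else ""
    return f"{timestamp}#{frag_index}{marker}.data"


def parse_data_filename(name):
    """Return (timestamp, frag_index, durable) for an EC .data filename."""
    stem, ext = name.rsplit(".", 1)
    if ext != "data":
        raise ValueError(f"not a .data file: {name}")
    parts = stem.split("#")
    if len(parts) not in (2, 3) or (len(parts) == 3 and parts[2] != "d"):
        raise ValueError(f"unexpected filename: {name}")
    return parts[0], int(parts[1]), len(parts) == 3


before_commit = data_filename("1475000000.12345", 3)
after_commit = data_filename("1475000000.12345", 3, durable=True)
print(before_commit, "->", after_commit)
print(parse_data_filename(after_commit))
```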

Diskfile suffix hashing returns the same result when the
new durable-data filename or the legacy durable file is
found in an object directory. A fragment archive that has
been created on an upgraded object server will therefore
appear to be in the same state, as far as the consistency
engine is concerned, as the same fragment archive created
on an older object server.

Since legacy .durable files will still exist in deployed
clusters, many of the unit test scenarios have been
duplicated for both new durable-data filenames and legacy
durable files.

Change-Id: I6f1f62d47be0b0ac7919888c77480a636f11f607
2016-10-10 18:11:02 +01:00
Luong Anh Tuan
53aebba903 Fix a typo in documentation
remove redundant 'this'

Change-Id: I8860190d882b255a3d416de685f930d2b2c0ad17
2016-10-04 10:07:10 +07:00
Alistair Coles
44a861787a Enable object server to return non-durable data
This patch improves EC GET response handling:

- The proxy no longer requires all object servers to have a
  durable file for the fragment archive that they return in
  response to a GET. The proxy will now be satisfied if just
  one object server has a durable file at the same timestamp
  as fragments from other object servers.

  This means that the proxy can now successfully GET an
  object that had missing durable files when it was PUT.

- The proxy will now ensure that it has a quorum of *unique*
  fragment indexes from object servers before considering a
  GET to be successful.

- The proxy is now able to fetch multiple fragment archives
  having different indexes from the same node. This enables
  the proxy to successfully GET an object that has some
  fragments that have landed on the same node, for example
  after a rebalance.

This new behavior is facilitated by an exchange of new
headers on a GET request and response between the proxy and
object servers.

An object server now includes with a GET (or HEAD) response:

- X-Backend-Fragments: the value of this describes all
  fragment archive indexes that the server has for the
  object by encoding a map of the form: timestamp -> <list
  of fragment indexes>

- X-Backend-Durable-Timestamp: the value of this is the
  internal form of the timestamp of the newest durable file
  that was found, if any.

- X-Backend-Data-Timestamp: the value of this is the
  internal form of the timestamp of the data file that was
  used to construct the diskfile.

A proxy server now includes with a GET request:

- X-Backend-Fragment-Preferences: the value of this
  describes the proxy's current preference with respect to
  those fragments that it would have object servers
  return. It encodes a list of timestamps, and for each
  timestamp a list of fragment indexes that the proxy does
  NOT require (because it already has them).

  The presence of an X-Backend-Fragment-Preferences header
  (even one with an empty list as its value) will cause the
  object server to search for the most appropriate fragment
  to return, disregarding the existence or not of any
  durable file. The object server assumes that the proxy
  knows best.
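
The success condition described above (enough *unique* fragment indexes
at one timestamp, with at least one object server reporting a durable
file at that timestamp) can be sketched in Python. The function name
and the response dictionaries are hypothetical stand-ins, not the
proxy's real data structures, and the quorum is assumed to be the
number of fragments needed to decode.

```python
def get_is_satisfied(responses, needed):
    """Decide whether a set of fragment responses satisfies a GET.

    responses: list of dicts with keys 'timestamp', 'frag_index' and
    'durable_timestamp' (None when the server found no durable file).
    needed: number of unique fragment indexes required (assumed here
    to be the number of data fragments).
    """
    frags_by_timestamp = {}
    durable_timestamps = set()
    for resp in responses:
        frags = frags_by_timestamp.setdefault(resp["timestamp"], set())
        frags.add(resp["frag_index"])
        if resp["durable_timestamp"] is not None:
            durable_timestamps.add(resp["durable_timestamp"])
    return any(
        len(frags) >= needed and timestamp in durable_timestamps
        for timestamp, frags in frags_by_timestamp.items()
    )


responses = [
    {"timestamp": "t1", "frag_index": 0, "durable_timestamp": None},
    {"timestamp": "t1", "frag_index": 1, "durable_timestamp": "t1"},
]
# Two unique indexes at t1 and one server reports t1 as durable.
print(get_is_satisfied(responses, needed=2))
```

In this sketch a single durable report is enough, and responses that
carry the same fragment index count only once toward the quorum.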

Closes-Bug: 1469094
Closes-Bug: 1484598

Change-Id: I2310981fd1c4622ff5d1a739cbcc59637ffe3fc3
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
2016-09-16 11:40:14 +01:00
Thiago da Silva
886fa0822a update pyeclib and liberasurecode links
Change-Id: Ic6d04083618362778363fea1570caaa865e44557
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2016-06-02 23:03:26 -04:00
Christian Schwede
043fbca6d0 Remove Erasure Coding beta status from docs
This removes notes describing support for erasure coding as beta. Questions
regarding the stability of EC come up regularly, and often refer
to the docs that still describe EC as beta.

Besides this, a note marking statsd support as beta has been removed as well.

Change-Id: If4fb6a5c4cb741d42953db3cee8cb17a1d774e15
2016-03-04 14:27:23 +00:00
Kazuhiro MIYAHARA
c3201f256c Remove execute permissions from doc files and swift.conf-sample
Some doc files and swift.conf-sample were given execute permissions in past changes.
This patch removes execute permissions from them.

Change-Id: Id8844989a8321578e9207566ebd6660f5b9523f0
2016-02-18 08:52:03 +00:00
Ondřej Nový
16976a0f14 Changed EC backend from jerasure to liberasurecode in examples and docs
liberasurecode_rs_vand is built into liberasurecode, so you don't need
the additional dependency libjerasure2.

liberasurecode_rs_vand is supported by pyeclib as of version
1.0.8, so bumping the version up.

Closes-Bug: #1534325
Change-Id: If2d96875694df8fd48c5278395859aaa165cb566
2016-02-02 23:08:11 +01:00
Bill Huber
0bcd7fd50e Update Erasure Coding Overview doc to remove Beta version
The major functionality of EC has been released for Liberty and
the beta version of the code has been removed since it is now
in production.

Change-Id: If60712045fb1af803093d6753fcd60434e637772
2015-12-18 11:43:12 -06:00
Alistair Coles
01f9d15045 Fix EC documentation of .durable quorum
Update the doc to reflect the change [1] to ndata + 1
.durable files being committed before a success response
is returned for a PUT.

[1] Ifd36790faa0a5d00ec79c23d1f96a332a0ca0f0b

Change-Id: I1744d457bda8a52eb2451029c4031962e92c2bb7
2015-10-08 18:55:29 +01:00
Alistair Coles
167f3c8cbd Update EC overview doc for PUT path
Update the EC overview docs 'under the hood' section to reflect the
change in durable file parity from 2 to ec_nparity + 1 [1].

Also fix some typos and cleanup the text.

[1] change id I80d666f61273e589d0990baa78fd657b3470785d

Change-Id: I23f6299da59ba8357da2bb5976d879d9a4bb173e
2015-09-30 09:45:57 +01:00
Bill Huber
530102ae07 Update EC Support on how to build an EC ring with replicas count
This doc is being updated to specify the replicas count parameter
to build an EC ring that enforces both data and parity placements
for each partition.

Change-Id: I770ad268e4017e610be3357e89b89f0b7d3c18af
Closes-Bug: 1487203
2015-09-15 09:40:39 +01:00
Atsushi SAKAI
964869accc Fix six typos on swift documentation
mechanisim => mechanism
    http://docs.openstack.org/developer/swift/cors.html
overridde => override
   http://docs.openstack.org/developer/swift/deployment_guide.html
extentsions => extensions
  http://docs.openstack.org/developer/swift/development_ondisk_backends.html
reuqest => request
  http://docs.openstack.org/developer/swift/logs.html
suport => support
  http://docs.openstack.org/developer/swift/overview_architecture.html
mininum => minimum
  http://docs.openstack.org/developer/swift/overview_erasure_code.html

$ git diff | diffstat
 cors.rst | 2 +-
 deployment_guide.rst | 2 +-
 development_ondisk_backends.rst | 2 +-
 logs.rst | 2 +-
 overview_architecture.rst | 2 +-
 overview_erasure_code.rst | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

Change-Id: I8e095f4c216b2cfae48dff1e17d387048349f73c
Closes-Bug: #1477877
2015-07-24 17:11:49 +09:00
Paul Luse
8f5d4d2455 Erasure Code Documentation
This patch adds all the relevant EC documentation to
the source tree. Notable additions are:
  - Updated SAIO documentation
  - Updates to existing swift documentation; and
  - Erasure Coding overview

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I0403016a4bb7dad9535891632753b0e5e9d402eb
Implements: blueprint swift-ec
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2015-04-14 00:52:17 -07:00