548 Commits

Author SHA1 Message Date
Alistair Coles
aa9afb5384 Test EC chunk_transformer with larger input chunks
The tests were lacking coverage for the chunk_transformer
reading multiple segment_size pieces from an input chunk.
This patch modifies test_chunk_transformer to exercise more
input chunk scenarios.

Also improve variable naming and comments in
_test_determine_chunk_destinations_prioritize

Change-Id: I4eb55ee3e87dae478828f7ccba86fec267492bd8
Related-Change: Ib9e8a6f67c2985164dd20b049c7f144f19fd1822
2017-03-17 09:19:47 +00:00
Kota Tsuyuzaki
a2f4046624 Small fixes for ec duplication
To address Alistair's comment at
https://review.openstack.org/#/c/219165.

This includes:

- Fix reconstructor log message to avoid redundant frag index info
- Fix incorrect FabricatedRing setting to have ec_k + ec_m replicas
- Use policy.ec_n_unique_fragments for testing frag index election
- Plus various minor cleanups and docs additions

The large refactoring of TestECMixin in test/unit/proxy/test_server.py
is in https://review.openstack.org/#/c/440466/ to keep this change clear.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>

Change-Id: I8096202f5f8d91296963f7a409a29d57fa7828e4
2017-03-16 21:59:56 -07:00
Alistair Coles
4f5c03c89d Fix intermittent EC GET test failure
test_GET_with_mixed_etags_at_same_timestamp in
test.unit.proxy.controllers.test_obj:TestECObjController
will fail intermittently when the randomly chosen bodies
of the two objects are identical. In conjunction with the
timestamps of the two objects being deliberately equal,
this makes the two objects identical.

Fix it by explicitly setting unique base strings for the
object bodies.

Change-Id: Idb1081edfa26b9f229f44b00c439cda33a7385fa
2017-03-16 16:56:32 +00:00
Jenkins
1e9b8888bf Merge "Enable cluster-wide CORS Expose-Headers setting" 2017-03-13 19:24:20 +00:00
Kota Tsuyuzaki
4187c839c3 Optimize ec duplication and its md5 hashing
Originally, EC duplication was designed to make the duplicated copies in
the chunk_transformer function for each node. However, that can cause
doubled (or more, depending on ec_duplication_factor) etag calculation,
which is expensive work for the Swift proxy.

With this patch, chunk_transformer keeps only one slot per unique
fragment, and send_chunk in ECObjectController._transfer_data picks the
suitable chunk to send to each object server using the unique backend
fragment index assigned by the _determine_chunk_destination function.

Note that Putter still keeps node_index, but the new putter_to_frag_index
dict and frag_hasher (chunk_index and chunk_hasher in the old names)
now refer to values by fragment index.

Change-Id: Ib9e8a6f67c2985164dd20b049c7f144f19fd1822
2017-03-08 01:12:09 -08:00
Jenkins
cf1c44dff0 Merge "Fixups for EC frag duplication tests" 2017-03-03 23:08:34 +00:00
Jenkins
1f36b5dd16 Merge "EC Fragment Duplication - Foundational Global EC Cluster Support" 2017-02-26 06:26:08 +00:00
Alistair Coles
e4972f5ac7 Fixups for EC frag duplication tests
Follow up for related change:
- fix typos
- use common helper methods
- refactor some tests to reduce duplicate code

Related-Change: Idd155401982a2c48110c30b480966a863f6bd305

Change-Id: I2f91a2f31e4c1b11f3d685fa8166c1a25eb87429
2017-02-25 20:40:04 -08:00
Romain LE DISEZ
9b47de3095 Enable cluster-wide CORS Expose-Headers setting
An operator offering a web UX to its customers might want to allow web
browsers to access some headers by default (e.g. X-Storage-Policy,
X-Container-Read, ...). This commit adds a new setting to the
proxy-server to allow some headers to be added cluster-wide to the CORS
header Access-Control-Expose-Headers.

Change-Id: I5ca90a052f27c98a514a96ee2299bfa1b6d46334
2017-02-25 19:00:28 +01:00
Jenkins
075c21a944 Merge "Add Vary: headers for CORS responses" 2017-02-23 01:45:29 +00:00
Kota Tsuyuzaki
40ba7f6172 EC Fragment Duplication - Foundational Global EC Cluster Support
This patch enables efficient PUT/GET for globally distributed clusters[1].

Problem:
Erasure coding can reduce the amount of data actually stored compared
with the replicated model. For example, with ec_k=6, ec_m=3 the stored
data is 1.5x the original, which is smaller than 3x replication.
However, unlike replication, erasure coding requires at least ec_k of
the total ec_k + ec_m fragments to be available to service a read
(e.g. 6 of 9 in the case above). As such, if we stored an EC object in
a Swift cluster spanning 2 geographically distributed data centers with
the same volume of disks, the fragments would likely be stored evenly
(about 4 and 5), so we would still need to access a faraway data center
to decode the original object. In addition, if one of the data centers
was lost in a disaster, the stored objects would be lost forever, and we
would have to cry a lot. To ensure highly durable storage, you might
think of making *more* parity fragments (e.g. ec_k=6, ec_m=10);
unfortunately this causes *significant* performance degradation due to
the cost of the mathematical calculation for erasure coding
encode/decode.

How this resolves the problem:
EC Fragment Duplication extends the initial solution by adding *more*
fragments from which to rebuild an object, similar to the approach
described above. The difference is that it makes *copies* of encoded
fragments. Experimental results[1][2] show that employing small ec_k and
ec_m gives enough performance to store/retrieve objects.

On PUT:

- Encode incoming object with small ec_k and ec_m  <- faster!
- Make duplicated copies of the encoded fragments. The number of copies
  is determined by 'ec_duplication_factor' in swift.conf
- Store all fragments in Swift Global EC Cluster

The duplicated fragments increase pressure on existing requirements
when decoding objects in service to a read request.  All fragments are
stored with their X-Object-Sysmeta-Ec-Frag-Index.  In this change, the
X-Object-Sysmeta-Ec-Frag-Index represents the actual fragment index
encoded by PyECLib, so there *will* be duplicates.  Anytime we must
decode the original object data, we must only consider ec_k fragments
that are unique according to their X-Object-Sysmeta-Ec-Frag-Index.  No
duplicate X-Object-Sysmeta-Ec-Frag-Index may be used when decoding an
object; duplicates should be expected and avoided if possible.

On GET:

This patch includes the following changes:
- Change the GET path to sort primary nodes into subsets, so that
  each subset includes unique fragments
- Change Reconstructor to be more aware of possibly duplicate fragments

For example, with this change, a policy could be configured such that

swift.conf:
ec_num_data_fragments = 2
ec_num_parity_fragments = 1
ec_duplication_factor = 2
(object ring must have 6 replicas)

At Object-Server:
node index (from object ring):  0 1 2 3 4 5 <- keep node index for
                                               reconstruct decision
X-Object-Sysmeta-Ec-Frag-Index: 0 1 2 0 1 2 <- each object keeps actual
                                               fragment index for
                                               backend (PyECLib)
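
As an illustration only (a sketch, not code from this patch), the layout
above can be reproduced by wrapping the ring node index around the number
of unique fragments. The helper name frag_index_for_node is hypothetical,
and the wrap-around rule is an assumption inferred from the 0 1 2 0 1 2
example:

```python
def frag_index_for_node(node_index, ec_k, ec_m):
    """Map a ring node index to a backend fragment index.

    Assumption: duplicates wrap around the ec_k + ec_m unique
    fragments, as in the example layout in this message.
    """
    n_unique = ec_k + ec_m
    return node_index % n_unique


# Values taken from the swift.conf example above.
ec_k, ec_m, duplication_factor = 2, 1, 2
n_replicas = (ec_k + ec_m) * duplication_factor  # 6 ring replicas

layout = [frag_index_for_node(i, ec_k, ec_m) for i in range(n_replicas)]
print(layout)
```

Each fragment index appears ec_duplication_factor times, which is why a
decoder has to pick fragments by unique index rather than by node.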

Additional improvements to Global EC Cluster Support will require
features such as Composite Rings, and more efficient fragment
rebalance/reconstruction.

1: http://goo.gl/IYiNPk (Swift Design Spec Repository)
2: http://goo.gl/frgj6w (Slide Share for OpenStack Summit Tokyo)

Doc-Impact

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Idd155401982a2c48110c30b480966a863f6bd305
2017-02-22 10:56:13 -08:00
Jenkins
5084a63770 Merge "Let users know entity size in 416 responses" 2016-12-05 19:06:45 +00:00
Clay Gerrard
d37d077cb1 Add Status Code tests for Container GET
I think a really simple list of status-map checks is missing from the
container unittests.  At least I hope I didn't miss it?  This one uses
some pretty decent modern infrastructure, so it should be easy to
expand.

Change-Id: I0929dad87214569cfd5ee3896840a92cc10c621f
2016-12-02 16:10:29 -08:00
Tim Burke
e8a80e874a Let users know entity size in 416 responses
If a user sends a Range header with no satisfiable ranges, we send back
a 416 Requested Range Not Satisfiable response. Previously however,
there would be no indication of the size of the object they were
requesting, so they wouldn't know how to craft a satisfiable range. We
*do* send a Content-Length, but it is (correctly) the length of the
error message.

The RFC [1] has an answer for this:

>  A server generating a 416 (Range Not Satisfiable) response to a
>  byte-range request SHOULD send a Content-Range header field with an
>  unsatisfied-range value, as in the following example:
>
>    Content-Range: bytes */1234
>
>  The complete-length in a 416 response indicates the current length of
>  the selected representation.

Now, we'll send a Content-Range header for all 416 responses, including
those coming from the object server as well as those generated on a
proxy because of the Range mangling required to support EC policies.
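
A minimal sketch of the header value described by the RFC text quoted
above. The function name unsatisfiable_content_range is hypothetical and
is not taken from the patch:

```python
def unsatisfiable_content_range(entity_length):
    """Build the Content-Range value for a 416 response.

    The '*' stands for the unsatisfied range; the number after the
    slash is the current length of the selected representation.
    """
    return 'bytes */%d' % entity_length


# Headers a 416 response for a 1234-byte object might carry.
headers = {'Content-Range': unsatisfiable_content_range(1234)}
print(headers['Content-Range'])
```

With this value a client can see the object size and craft a satisfiable
Range on its next request.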

[1] RFC 7233, section 4.2, although similar language was used in RFC
2616, sections 10.4.17 and 14.16

Change-Id: I80c7390fc6f84a10a212b0641bb07a64dfccbd45
2016-11-30 10:52:08 -08:00
Tim Burke
e8a5448b07 Add X-Openstack-Request-Id to Access-Control-Expose-Headers
Change-Id: Ib95a693042f0b3cf204033eb5957660cb3573dcf
Related-Change: I56cd4738808b99c0a08463f83c100be51a62db05
2016-11-16 12:39:12 -08:00
Thiago da Silva
d25216bdda remove double checks on account/container info
Continuing the clean up in account and container
info, removed duplicated validation from account_info
and container_info methods, since the same validations
were recently added to get_account_info and get_container_info.

Change-Id: I1ad745fe809367d22649c83f38c4de7a74cac44e
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2016-11-11 10:46:17 -08:00
Alistair Coles
2a75091c58 Make ECDiskFileReader check fragment metadata
This patch makes the ECDiskFileReader check the validity of EC
fragment metadata as it reads chunks from disk and quarantine a
diskfile with bad metadata. This in turn means that both the object
auditor and a proxy GET request will cause bad EC fragments to be
quarantined.

This change is motivated by bug 1631144, which may result in corrupt EC
fragments being written to disk that nonetheless appear valid to the
object auditor's md5 hash and content-length checks.

NotImplemented:

 * perform metadata check when a read starts on any frag_size
   boundary, not just at zero

Related-Bug: #1631144
Closes-Bug: #1633647

Change-Id: Ifa6a7f8aaca94c7d39f4aeb9d4fa3f59c4f6ee13
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
2016-11-01 13:11:02 -07:00
Ondřej Nový
33c18c579e Remove executable flag from some test modules
Change-Id: I36560c2b54c43d1674b007b8105200869b5f7987
2016-10-31 21:22:10 +00:00
Alistair Coles
b13b49a27c EC - eliminate .durable files
Instead of using a separate .durable file to indicate
the durable status of a .data file, rename the .data
to include a durable marker in the filename. This saves
one inode for every EC fragment archive.

An EC policy PUT will, as before, first rename a temp
file to:

   <timestamp>#<frag_index>.data

but now, when the object is committed, that file will be
renamed:

   <timestamp>#<frag_index>#d.data

with the '#d' suffix marking the data file as durable.
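
To make the two filename forms concrete, here is a small illustrative
parser. It is a sketch under the naming scheme described above only;
parse_ec_data_filename is a hypothetical helper, not the diskfile code,
and the timestamp in the usage lines is made up:

```python
def parse_ec_data_filename(filename):
    """Split an EC .data filename into (timestamp, frag_index, durable).

    Accepts '<timestamp>#<frag_index>.data' and
    '<timestamp>#<frag_index>#d.data'.
    """
    suffix = '.data'
    if not filename.endswith(suffix):
        raise ValueError('not a .data file: %r' % filename)
    parts = filename[:-len(suffix)].split('#')
    if len(parts) == 2:
        durable = False
    elif len(parts) == 3 and parts[2] == 'd':
        durable = True
    else:
        raise ValueError('unexpected filename: %r' % filename)
    return parts[0], int(parts[1]), durable


before_commit = parse_ec_data_filename('1475000000.00000#3.data')
after_commit = parse_ec_data_filename('1475000000.00000#3#d.data')
print(before_commit, after_commit)
```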

Diskfile suffix hashing returns the same result when the
new durable-data filename or the legacy durable file is
found in an object directory. A fragment archive that has
been created on an upgraded object server will therefore
appear to be in the same state, as far as the consistency
engine is concerned, as the same fragment archive created
on an older object server.

Since legacy .durable files will still exist in deployed
clusters, many of the unit tests scenarios have been
duplicated for both new durable-data filenames and legacy
durable files.

Change-Id: I6f1f62d47be0b0ac7919888c77480a636f11f607
2016-10-10 18:11:02 +01:00
Jenkins
8526d4c5d2 Merge "Fix using filter() to meet python2,3" 2016-09-28 22:25:57 +00:00
Clay Gerrard
bfaa8e0583 Fix ChunkWriteError when running unittests
I don't think this is a real bug - just that the mocked iter wasn't
closing its subiters like the real iter does.

Change-Id: I44c8159f9eea8737bc86b6c7eb59a512e57e86c1
2016-09-21 17:33:30 -07:00
Luong Anh Tuan
19a684dded Fix using filter() to meet python2,3
As mentioned in link[1], if we need filter on python3,
replace filter(lambda obj: test(obj), data) with:
[obj for obj in data if test(obj)].

[1] https://wiki.openstack.org/wiki/Python3

Change-Id: Ia1ea2ec89e4beb957a4cb358b0d0cef970f23e0a
2016-09-22 07:32:38 +07:00
Jenkins
93b02c931f Merge "Turn on F812 check" 2016-09-19 16:35:17 +00:00
Tim Burke
a741998bff Turn on F812 check
F812 list comprehension redefines <variable> from line ...

While the current violations were benign, this sort of code can easily
lead to subtle bugs. Seems worth checking, especially given how cheap it
is to bring existing code in line with it.

Change-Id: Ibdcf9f93b85a1f1411198001df6bdbfa8f92d114
2016-09-16 14:44:37 -07:00
Alistair Coles
44a861787a Enable object server to return non-durable data
This patch improves EC GET response handling:

- The proxy no longer requires all object servers to have a
  durable file for the fragment archive that they return in
  response to a GET. The proxy will now be satisfied if just
  one object server has a durable file at the same timestamp
  as fragments from other object servers.

  This means that the proxy can now successfully GET an
  object that had missing durable files when it was PUT.

- The proxy will now ensure that it has a quorum of *unique*
  fragment indexes from object servers before considering a
  GET to be successful.

- The proxy is now able to fetch multiple fragment archives
  having different indexes from the same node. This enables
  the proxy to successfully GET an object that has some
  fragments that have landed on the same node, for example
  after a rebalance.
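
The unique-index requirement in the second point can be sketched as
follows. This is an illustration rather than the proxy's implementation:
has_unique_fragment_quorum is a hypothetical helper, and it assumes the
quorum size is ec_k fragments at a single timestamp:

```python
def has_unique_fragment_quorum(responses, ec_k):
    """Return True if some timestamp has at least ec_k unique indexes.

    responses is an iterable of (timestamp, frag_index) pairs, one per
    fragment archive returned by an object server.
    """
    indexes_by_timestamp = {}
    for timestamp, frag_index in responses:
        indexes_by_timestamp.setdefault(timestamp, set()).add(frag_index)
    return any(len(indexes) >= ec_k
               for indexes in indexes_by_timestamp.values())


# Three responses, but only two unique indexes: not enough for ec_k=3.
duplicated = [('t1', 0), ('t1', 1), ('t1', 1)]
# Three unique indexes at the same timestamp: enough.
unique = [('t1', 0), ('t1', 1), ('t1', 2)]

print(has_unique_fragment_quorum(duplicated, 3),
      has_unique_fragment_quorum(unique, 3))
```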

This new behavior is facilitated by an exchange of new
headers on a GET request and response between the proxy and
object servers.

An object server now includes with a GET (or HEAD) response:

- X-Backend-Fragments: the value of this describes all
  fragment archive indexes that the server has for the
  object by encoding a map of the form: timestamp -> <list
  of fragment indexes>

- X-Backend-Durable-Timestamp: the value of this is the
  internal form of the timestamp of the newest durable file
  that was found, if any.

- X-Backend-Data-Timestamp: the value of this is the
  internal form of the timestamp of the data file that was
  used to construct the diskfile.

A proxy server now includes with a GET request:

- X-Backend-Fragment-Preferences: the value of this
  describes the proxy's current preference with respect to
  those fragments that it would have object servers
  return. It encodes a list of timestamp, and for each
  timestamp a list of fragment indexes that the proxy does
  NOT require (because it already has them).

  The presence of a X-Backend-Fragment-Preferences header
  (even one with an empty list as its value) will cause the
  object server to search for the most appropriate fragment
  to return, disregarding the existence or not of any
  durable file. The object server assumes that the proxy
  knows best.

Closes-Bug: 1469094
Closes-Bug: 1484598

Change-Id: I2310981fd1c4622ff5d1a739cbcc59637ffe3fc3
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
2016-09-16 11:40:14 +01:00
Jenkins
8608bd96dd Merge "Make object creation more atomic in Linux" 2016-09-13 04:02:47 +00:00
Tim Burke
1883c1ec23 Make get_info requests useful with recheck_*_existence == 0
Before, when recheck_account_existence or recheck_container_existence
was set to zero, get_info requests for accounts or containers wouldn't
populate the env cache for the current request, so it wouldn't return
the data *it just got*.

Now, we'll still populate the env cache and memcache, as a cache time of
zero means "keep it indefinitely". See the memcache docs at
https://github.com/memcached/memcached/blob/1.4.25/doc/protocol.txt#L163

Change-Id: Ia648263073bc88e35216cafb76821b7ce02c1d03
Closes-Bug: 1224734
2016-09-01 15:10:04 -07:00
Jenkins
c2f5e30c86 Merge "Silence "Client disconnected" warnings on reads." 2016-09-01 22:01:58 +00:00
Jenkins
6b07bcbf05 Merge "Tighten header checks for object PUT/POST paths" 2016-09-01 20:55:48 +00:00
Timur Alperovich
2825909d25 Silence "Client disconnected" warnings on reads.
When a client fully reads the content and closes the iterator, the
Client disconnected warning is still generated, as there is no logic
to check whether the GeneratorExit exception was raised after the
client received all of the data. This can be observed when doing
large object reads or using an InternalClient and reading exactly
Content-Length bytes from the returned app_iter body.

The patch amends the behavior to hoist the count of bytes the client has
read from a given part and only raise an exception if there are more
parts left or a part was not fully read.

Lastly, the GeneratorExit exception is no longer swallowed and is
re-raised in the handling code.

Change-Id: I879149897fdb25aae977b7f17e580610b188ce04
2016-09-01 04:33:50 +00:00
Jenkins
0944753b37 Merge "Fix EC ring validation at ring reload" 2016-08-27 01:01:24 +00:00
Tim Burke
4d4885acdc Tighten header checks for object PUT/POST paths
Change-Id: If2cd059719fe5af1e73ecde5306e9f68d590831f
2016-08-25 17:39:00 -07:00
Prashanth Pai
773edb4a5d Make object creation more atomic in Linux
Linux 3.11 introduced O_TMPFILE as a flag to the open() system call. This
enables users to get an fd to an unnamed temporary file. As it's unnamed,
it does not require the caller to devise unique names. It is also not
accessible through any path. Hence, file creation is race-free.

This file is initially unreachable. It is then populated with data (write),
metadata (fsetxattr) and fsync'd before being atomically linked into the
filesystem in a fully formed state using the linkat() system call. Only
after a successful linkat() will the object file be available for reference.

Caveats
* Unlike os.rename(), linkat() cannot overwrite destination path if it
  already exists. If path exists, we unlink and try again.
* XFS support for O_TMPFILE was only added in Linux 3.15.
* If client disconnects during object upload, although there is no
  incomplete/stale file on disk, the object directory would persist
  and is not cleaned up immediately.

Change-Id: I8402439fab3aba5d7af449b5e465f89332f606ec
Signed-off-by: Prashanth Pai <ppai@redhat.com>
2016-08-24 14:56:00 +05:30
Tim Burke
3e46079546 Add Vary: headers for CORS responses
From the (non-normative) Implementation Considerations section of
https://www.w3.org/TR/cors/#resource-implementation :

> Resources that wish to enable themselves to be shared with multiple
> Origins but do not respond uniformly with "*" must in practice
> generate the Access-Control-Allow-Origin header dynamically in
> response to every request they wish to allow. As a consequence,
> authors of such resources should send a Vary: Origin HTTP header or
> provide other appropriate control directives to prevent caching of
> such responses, which may be inaccurate if re-used across-origins.

We do the first part (dynamic Access-Control-Allow-Origin: generation
based on the incoming Origin: header), but not the second (send a
Vary: Origin header). Consider this scenario:

 1. Swift user Alice has some static content that should be available
    from some (but not all) other domains. She creates a new container
    with an appropriate X-Container-Meta-Access-Control-Allow-Origin
    like "http://foo.example.com http://bar.example.com".

 2. End user Bob pulls up a browser and visits http://foo.example.com,
    which references a cross-origin resource. Seeing this, the browser
    issues a preflight request and gets back a response that includes
    headers like:

        Access-Control-Allow-Origin: http://foo.example.com
        Access-Control-Allow-Methods: HEAD, GET, PUT, POST, COPY,
         OPTIONS, DELETE

    Since the preflight succeeded, the browser follows through on the
    cross-origin request and everything loads properly.

 3. Now Bob visits http://bar.example.com, which references the same
    resource. Ordinarily, the exact same thing would happen, but with
    http://bar.example.com in the headers. However, if the browser
    cached the preflight response (because it didn't want to make two
    requests every time it needed a resource), it would assume the server
    would only allow resource-sharing with http://foo.example.com and
    not load the resource.

Similar issues arise from the dynamically-generated
Access-Control-Allow-Headers header.
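
A rough sketch of the idea, not the code in this patch. The function
cors_response_headers is hypothetical; it assumes the allowed origins
arrive as a space-separated string like the container metadata value in
step 1:

```python
def cors_response_headers(origin, allowed_origins):
    """Compute CORS headers for a request from the given Origin.

    allowed_origins is a space-separated string, e.g. the value of
    X-Container-Meta-Access-Control-Allow-Origin.
    """
    allowed = allowed_origins.split()
    if '*' in allowed:
        # Uniform response for every origin: nothing varies.
        return {'Access-Control-Allow-Origin': '*'}
    # The response depends on the request's Origin header, so tell
    # caches not to reuse it across origins.
    headers = {'Vary': 'Origin'}
    if origin in allowed:
        headers['Access-Control-Allow-Origin'] = origin
    return headers


allowed = 'http://foo.example.com http://bar.example.com'
foo = cors_response_headers('http://foo.example.com', allowed)
bar = cors_response_headers('http://bar.example.com', allowed)
print(foo, bar)
```

Because both responses carry Vary: Origin, a cache that stored the
response for foo.example.com should not serve it for bar.example.com.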

For more information on the Vary: header, see
http://tools.ietf.org/html/rfc7231#section-7.1.4

Change-Id: I9950e593312f654ee596b7f43f7ab9e5b684d8e5
2016-08-19 16:28:16 -07:00
Jenkins
1c74fbec02 Merge "Use more specific asserts in test/unit/proxy tests" 2016-08-19 03:54:49 +00:00
Jenkins
9d29ca1c76 Merge "Last-Modified header support on HEAD/GET container" 2016-08-11 14:44:12 +00:00
Rebecca Finn
aa2a84ba8a Check object metadata constraints after authorizing
In the object proxy controller, the POST method checked the metadata of an
object before calling swift.authorize. This could allow an auth middleware to
set metadata that violates constraints. Instead, checking the metadata should
take place after authorization.

Change-Id: I5f05039498c406473952e78c6a40ec11e8b53f8e
Closes-Bug: #1596944
2016-07-28 19:05:08 +00:00
Kota Tsuyuzaki
1eb96397e7 Fix EC ring validation at ring reload
Swift EC has a strong constraint that the ring must have a number of
replicas equal to ec_k + ec_m. That is validated when servers start up.
However, Swift may also load such an invalid ring later, when a request
comes in and calls some node iteration such as get_nodes or
get_part_nodes, and no ring validation happens there.

This patch moves ring validation from the policy's validate_ring into the
ring instance as a validation_hook that runs at ring reload. With this
patch, the ring instance is allowed to keep using the old ring if the
reload is not forced.

Note that the exception raised when an invalid ring is found was changed
from RingValidationError to RingLoadError because RingValidationError is
a child of RingBuilderError but the ring reload is obviously outside of
"builder".

Closes-Bug: #1534572

Change-Id: I6428fbfb04e0c79679b917d5e57bd2a34f2a0875
2016-07-24 21:49:57 -07:00
Gábor Antal
75a58a6dd8 Use more specific asserts in test/unit/proxy tests
I replaced generic asserts with more specific assert methods,
e.g. assertTrue(sth == None) with assertIsNone(sth),
assertTrue(isinstance(inst, type)) with assertIsInstance(inst, type),
and assertTrue(not sth) with assertFalse(sth).

The code gets more readable, and a better description will be shown on failure.
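
For illustration, a tiny standalone test case using the more specific
methods. The class and variable names are placeholders, not taken from
test/unit/proxy:

```python
import unittest


class TestSpecificAsserts(unittest.TestCase):
    def test_specific_asserts(self):
        sth = None
        inst = {}
        # Instead of assertTrue(sth == None)
        self.assertIsNone(sth)
        # Instead of assertTrue(isinstance(inst, dict))
        self.assertIsInstance(inst, dict)
        # Instead of assertTrue(not sth)
        self.assertFalse(sth)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(
    TestSpecificAsserts)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

On failure, assertIsInstance reports the object and the expected type,
whereas assertTrue only reports that False is not true.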

Change-Id: If6aad8681aab7c9a41d65a4f449d8abbe3e64616
2016-07-15 13:32:31 +00:00
Alistair Coles
ca2f6d13b6 Fix unicode errors in object controller logging
Change swift.proxy.server.Application.error_occurred()
to decode message as utf-8 in same way that the
exception_occurred() method was changed in [1].

This prevents a unicode error when logging error responses
in swift.proxy.controllers.base.Controller._make_request()
for paths that have non-ascii characters. Although the unicode
error is currently caught by a surrounding except clause, the
logging and error limiting treatment is different for ascii
vs non-ascii paths. This patch makes them consistent.

Fix the server type reported in _make_request() to be
the correct server type, not always 'Container Server'.

Fix path arg passed to _get_conn_response in
swift.proxy.controllers.obj.BaseObjectController to be req.path
rather than req.

Add unit tests for error_occurred() being called with non-ascii
paths and extend tests for exception_occurred() (see Related-Bug).

[1] Change-Id: Icb7284eb5abc9869c1620ee6366817112d8e5587

Related-Bug: #1597210
Change-Id: I285499d164bff94835bdddb25d2af6d73114c281
2016-07-07 13:50:17 +01:00
Brian Cline
7568ea5dd9 Prevent down nodes failing PUTs with non-ascii obj names
On an object PUT with a non-ascii name, if we hit some kind of
exception speaking to only one object-server of the N we try to
connect to, we try to log it -- but this causes an exception when
interpolating the UTF-8 encoded path iff the message template is
unicode.

Since this is essentially an exception within an exception handler,
this fails the entire request with a 500 error -- even though the
other nodes may have been just fine. This occurs before it attempts
a handoff node.

The simplest way to reproduce this is by running func tests against
a small cluster where one of the object nodes is not running.

N.B. The locale of the node does not matter because the message
template is interpolated with node/device data from the Ring which is
always unicode because of json.

This includes an update to the FakeRing used by unittest
infrastructure to ensure that the FakeRing devices make a round-trip
through json to ensure consistent typing with real Rings.

Change-Id: Icb7284eb5abc9869c1620ee6366817112d8e5587
Closes-bug: #1597210
2016-07-05 16:33:15 -07:00
Alistair Coles
3ad003cf51 Enable middleware to set metadata on object POST
Adds a new form of system metadata for objects.

Sysmeta cannot be updated by an object POST because
that would cause all existing sysmeta to be deleted.
Crypto middleware will want to add 'system' metadata
to object metadata on PUTs and POSTs, but it is ok
for this metadata to be replaced en-masse on every
POST.

This patch introduces x-object-transient-sysmeta-*
that is persisted by object servers and returned
in GET and HEAD responses, just like user metadata,
without polluting the x-object-meta-* namespace.
All headers in this namespace will be filtered
inbound and outbound by the gatekeeper, so cannot
be set or read by clients.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Janie Richling <jrichli@us.ibm.com>

Change-Id: I5075493329935ba6790543fc82ea6e039704811d
2016-06-28 11:00:33 +01:00
Janie Richling
03b762e80a Support for http footers - Replication and EC
Before this patch, the proxy ObjectController supported sending
metadata from the proxy server to object servers in "footers" that
trail the body of HTTP PUT requests, but this support was for EC
policies only.  The encryption feature requires that footers are sent
with both EC and replicated policy requests in order to persist
encryption specific sysmeta, and to override container update headers
with an encrypted Etag value.

This patch:

- Moves most of the functionality of ECPutter into a generic Putter
  class that is used for replicated object PUTs without footers.

- Creates a MIMEPutter subclass to support multipart and multiphase
  behaviour required for any replicated object PUT with footers and
  all EC PUTs.

- Modifies ReplicatedObjectController to use Putter objects in place
  of raw connection objects.

- Refactors the _get_put_connections method and _put_connect_node methods
  so that more code is in the BaseObjectController class and therefore
  shared by [EC|Replicated]ObjectController classes.

- Adds support to call a callback that middleware may have placed
  in the environ, so the callback can set footers. The
  x-object-sysmeta-ec- namespace is reserved and any footer values
  set by middleware in that namespace will not be forwarded to
  object servers.

In addition this patch enables more than one value to be added to the
X-Backend-Etag-Is-At header. This header is used to point to an
(optional) alternative sysmeta header whose value should be used when
evaluating conditional requests with If-[None-]Match headers.  This is
already used with EC policies when the ECObjectController has
calculated the actual body Etag and sent it using a footer
(X-Object-Sysmeta-EC-Etag). X-Backend-Etag-Is-At is in that case set
to X-Object-Sysmeta-Ec-Etag so as to point to the actual body Etag
value rather than the EC fragment Etag.

Encryption will also need to add a pointer to an encrypted Etag value.
However, the referenced sysmeta may not exist, for example if the
object was created before encryption was enabled. The
X-Backend-Etag-Is-At value is therefore changed to support a list of
possible locations for alternate Etag values. Encryption will place
its expected alternative Etag location on this list, as will the
ECObjectController, and the object server will look for the first
object metadata to match an entry on the list when matching
conditional requests. That way, if the object was not encrypted then
the object server will fall through to using the EC Etag value, or in
the case of a replicated policy will fall through to using the normal
Etag metadata.
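
The fall-through described above might look roughly like this. It is a
sketch with assumptions: resolve_etag is a hypothetical helper, the list
is assumed to be comma-separated in the header value, and the
crypto-related sysmeta name in the usage lines is made up for
illustration:

```python
def resolve_etag(metadata, etag_is_at):
    """Pick the Etag to use for conditional request matching.

    Walk the X-Backend-Etag-Is-At locations in order and use the
    first one present in the object metadata; otherwise fall back to
    the normal ETag metadata.
    """
    for name in etag_is_at.split(','):
        name = name.strip()
        if name and name in metadata:
            return metadata[name]
    return metadata.get('ETag')


# Hypothetical header value: encryption's location, then the EC one.
etag_is_at = 'X-Object-Sysmeta-Crypto-Etag, X-Object-Sysmeta-Ec-Etag'

# Unencrypted EC object: falls through to the EC body Etag.
ec_object = {'ETag': 'frag-etag', 'X-Object-Sysmeta-Ec-Etag': 'body-etag'}
# Unencrypted replicated object: falls through to the normal ETag.
replicated_object = {'ETag': 'plain-etag'}

print(resolve_etag(ec_object, etag_is_at),
      resolve_etag(replicated_object, etag_is_at))
```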

If your proxy has a third-party middleware that uses X-Backend-Etag-Is-At
and it upgrades before an object server it's talking to then conditional
requests may be broken.

UpgradeImpact

Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>

Closes-Bug: #1594739
Change-Id: I12a6e41150f90de746ce03623032b83ed1987ee1
2016-06-22 11:55:49 +01:00
Alistair Coles
928c4790eb Refactor tests and add tests
Relocates some test infrastructure in preparation for
use with encryption tests, in particular moves the test
server setup code from test/unit/proxy/test_server.py
to a new helpers.py so that it can be re-used, and adds
ability to specify additional config options for the
test servers (used in encryption tests).

Adds unit test coverage for extract_swift_bytes and functional
test coverage for container listings. Adds a check on the content
and metadata of reconciled objects in probe tests.

Change-Id: I9bfbf4e47cb0eb370e7a74d18c78d67b6b9d6645
2016-06-15 16:36:25 +01:00
Kota Tsuyuzaki
fcb6e4cd3a Last-Modified header support on HEAD/GET container
This patch enables the container-server to show x-put-timestamp as
a last-modified header.

Note that the last-modified header changes only when a request for
the container (PUT container or POST container) comes into Swift;
requests for objects (e.g. PUT object, POST object) never affect the
last-modified value. However, when using python-swiftclient, e.g.
"swift upload", the last-modified value will be close to the upload
time because python-swiftclient makes a PUT container request for
each "swift upload".

Change-Id: I9971bf90d24eee8921f67c02b7e2c80fd8995623
2016-06-07 12:02:03 +01:00
Alistair Coles
7b706926a8 Fix setup of manifest responses in SLO tests
The swift_bytes param is removed from the content-type
in the proxy object controller, so the SLO unit tests should
not be registering GET responses with FakeSwift that have
swift_bytes appended to the content-type.

Nor should submanifest segment dicts have swift_bytes appended to
their content-type values.

Also adds a test for the object controller and container server
handling of SLO swift_bytes.

Change-Id: Icf9bd87eee25002c8d9728b16e60c8347060f320
2016-05-23 17:26:28 +01:00
Tim Burke
2744492f30 Use the same key for memcache and env['swift.infocache']
When we were caching directly to the WSGI environment, it made sense to
have different keys for the different caches. Now that we have a
separate data structure for the per-request cache, however, we ought to
be consistent.

Change-Id: I199cba6e5fc9ab4205bba369e6a2f34fc5ce22d4
2016-05-16 18:43:32 -07:00
Jenkins
0fb92ce4ec Merge "Fix up get_account_info and get_container_info" 2016-05-16 23:58:51 +00:00
Jenkins
fceb8423c1 Merge "Make info caching work across subrequests" 2016-05-16 20:58:26 +00:00
Samuel Merritt
1c88d2cb81 Fix up get_account_info and get_container_info
get_account_info used to work like this:

  * make an account HEAD request

  * ignore the response

  * get the account info by digging around in the request environment,
    where it had been deposited by elves or something

Not actually elves, but the proxy's GETorHEAD_base method would take
the HEAD response and cache it in the response environment, which was
the same object as the request environment, thus enabling
get_account_info to find it.

This was extraordinarily brittle. If a WSGI middleware were to
shallow-copy the request environment, then any middlewares to its left
could not use get_account_info, as the left middleware's request
environment would no longer be identical to the response environment
down in GETorHEAD_base.

Now, get_account_info works like this:

  * make an account HEAD request.

  * if the account info is in the request environment, return it. This
    is an optimization to avoid a double-set in memcached.

  * else, compute the account info from the response headers, store it
    in caches, and return it.

This is much easier to think about; get_account_info can get and cache
account info all on its own; the cache check and cache set are right
next to each other.

All the above is true for get_container_info as well.
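
A simplified sketch of the new flow, for illustration only. The real
function has a different signature; head_request, the cache key format
and the fields stored are all hypothetical here, and memcache is left
out. It returns a copy so that callers cannot modify what is stored in
swift.infocache:

```python
import copy


def get_account_info(env, account, head_request):
    """Make a HEAD request, then check the cache or fill it."""
    status, headers = head_request(env, account)
    infocache = env.setdefault('swift.infocache', {})
    key = 'account/%s' % account  # hypothetical key format
    if key not in infocache:
        # Compute the info from the response headers and cache it.
        infocache[key] = {
            'status': status,
            'container_count': int(
                headers.get('X-Account-Container-Count', 0)),
        }
    # Hand back a copy so callers cannot change the cached value.
    return copy.deepcopy(infocache[key])


def fake_head(env, account):
    # Stand-in for an account HEAD request in this sketch.
    return 200, {'X-Account-Container-Count': '3'}


env = {}
info = get_account_info(env, 'AUTH_test', fake_head)
info['container_count'] = 99  # caller mutates its own copy only
fresh = get_account_info(env, 'AUTH_test', fake_head)
print(fresh['container_count'])
```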

get_info() is still around, but it's just a shim. It was trying to
unify get_account_info and get_container_info to exploit the
commonalities, but the number of times that "if container:" showed up
in get_info and its helpers really indicated that something was
wrong. I'd rather have two functions with some duplication than one
function with no duplication but a bunch of "if container:" branches.

Other things of note:

  * a HEAD request to a deleted account returns 410, but
    get_account_info would return 404 since the 410 came from the
    account controller *after* GETorHEAD_base ran. Now
    get_account_info returns 410 as well.

  * cache validity period (recheck_account_existence and
    recheck_container_existence) is now communicated to
    get_account_info via an X-Backend header. This way,
    get_account_info doesn't need a reference to the
    swift.proxy.server.Application object.

  * both logged swift_source values are now correct for
    get_container_info calls; before, on a cold cache,
    get_container_info would call get_account_info but not pass along
    swift_source, resulting in get_account_info logging "GET_INFO" as
    the source. Amusingly, there was a unit test asserting this bogus
    behavior.

  * callers that modify the return value of get_account_info or of
    get_container_info don't modify what's stored in swift.infocache.

  * get_account_info on an account that *can* be autocreated but has
    not been will return a 200, same as a HEAD request. The old
    behavior was a 404 from get_account_info but a 200 from
    HEAD. Callers can tell the difference by looking at
    info['account_really_exists'] if they need to know the difference
    (there is one call site that needs to know, in container
    PUT). Note: this is for all accounts when the proxy's
    "account_autocreate" setting is on.

Change-Id: I5167714025ec7237f7e6dd4759c2c6eb959b3fca
2016-05-13 10:40:56 -07:00