5926 Commits

Author SHA1 Message Date
Zuul
f7b9a72a6e Merge "object-expirer: test for mix of async and delayed deletes" 2024-04-26 15:04:15 +00:00
Zuul
b6c377e7e5 Merge "support x-open-expired header for expired objects" 2024-04-26 13:07:57 +00:00
Zuul
2e487ba695 Merge "expirer: account and container level delay_reaping" 2024-04-26 10:28:47 +00:00
Clay Gerrard
b490857b47 object-expirer: test for mix of async and delayed deletes
Related-Change: I106103438c4162a561486ac73a09436e998ae1f0
Change-Id: I95606aed5a5bd70424fbc51dd965f238fa7f064f
2024-04-26 10:53:19 +01:00
indianwhocodes
11eb17d3b2 support x-open-expired header for expired objects
If the global configuration option 'enable_open_expired' is set
to true in the config, then the client will be able to make a
request with the header 'x-open-expired' set to true in order
to access an object that has expired, provided it is in its
grace period. If this config flag is set to false, the client
will not be able to access any expired objects, even with the
header, which is the default behavior unless the flag is set.

When a client sets a 'x-open-expired' header to a true value for a
GET/HEAD/POST request the proxy will forward x-backend-open-expired to
storage server. The storage server will allow clients that set
x-backend-open-expired to open and read an object that has not yet
been reaped by the object-expirer, even after the x-delete-at time
has passed.

The header is always ignored when used with temporary URLs.

Co-Authored-By: Anish Kachinthaya <akachinthaya@nvidia.com>
Related-Change: I106103438c4162a561486ac73a09436e998ae1f0
Change-Id: Ibe7dde0e3bf587d77e14808b169c02f8fb3dddb3
2024-04-26 10:13:40 +01:00
Mandell Degerness
5961ba0ca7 expirer: account and container level delay_reaping
The object expirer can be configured to delay the reaping of
objects from disk after their expiration time using account
and container level delay_reaping values. The delay_reaping
value of accounts and containers in seconds is configured in
the object server config. The object expirer references these
configured values to only reap objects from specified accounts
and containers after their corresponding delays.

The goal of the delay_reaping feature is to prevent accidental or
premature data loss if an object marked for deletion with the
'x-delete-at' feature should not be reaped immediately, for
whatever reason.

Configuring the delay_reaping value at a granular account and
container level is beneficial for being able to keep storage
capacity consumption in control while maintaining a desired
data recovery window.

This patch also adds a sample configuration, documentation, and
tests for bad configurations and grace period functionality.

Co-Authored-By: Anish Kachinthaya <akachinthaya@nvidia.com>
Change-Id: I106103438c4162a561486ac73a09436e998ae1f0
2024-04-25 13:59:36 -07:00
Alistair Coles
3d5a97f76b proxy_logging: unit test first-byte.timing metrics
Add some test assertions to cover the first-byte timing metrics
introduced in the related change.

Add ttfb param to log_request docstring.

Change-Id: I530652dd672d7d4e5eac351ccbad318773414f7d
Related-Change: I1611e34846e586703e9d3709fa64e8df41f2d685
2024-04-19 12:33:48 +01:00
Clay Gerrard
d4435e1229 proxy-logging: emit stats more consistently
Change-Id: I526bbcc59c9eb5923c3784d5d06bc38998cb48db
2024-04-03 18:50:04 -05:00
Clay Gerrard
3c449e78e4 test: assert behavior of proxy_logging metrics
Change-Id: I651ddd40e9115a56727096d4a3aa84589146308f
2024-04-03 18:49:13 -05:00
Zuul
0e5aeb5045 Merge "s3api: Fix handling of non-ascii access keys" 2024-03-25 21:40:58 +00:00
Zuul
cdc4f264d2 Merge "recon-cron: Tolerate missing directories" 2024-03-25 06:45:56 +00:00
Tim Burke
8424b02290 s3api: Fix handling of non-ascii access keys
We stuff the access key into the request path until we get back a
more-authoritative account name from auth. But it needs to be a WSGI
string when we do!

Closes-Bug: #2058748
Change-Id: I34adb8141cc9e62d17a27f01c63f40d1dd25991c
2024-03-22 10:02:39 -07:00
Tim Burke
f31b6f7353 recon-cron: Tolerate missing directories
Any of these directories may get unlinked between when we saw them in
their parent's directory listing and when we go to descend.

Change-Id: I1dfc0ee1d9e70cb0600557cde980bd5880bd40b3
2024-03-21 14:10:14 -07:00
Zuul
fd3997f027 Merge "tests: Update CORS geckodriver" 2024-03-20 03:48:32 +00:00
Tim Burke
af15ad53fb tests: Update CORS geckodriver
Change-Id: I5ab762dfe0f85e346c4868ec4540884ba5f0a7f4
2024-03-14 20:45:47 -07:00
Jianjian Huo
27ef11ea14 test: implement cache expiration time in MockMemcached
Change-Id: I16ec414f87ac1a5e1e87e7560290c5ef0ca4f7cf
2024-03-15 13:50:32 +11:00
Zuul
b6dc24dbc0 Merge "s3api test for zero byte mpu" 2024-03-13 17:21:42 +00:00
Zuul
891d06345e Merge "s3api: Support GET/HEAD request with ?partNumber" 2024-03-13 08:28:03 +00:00
Zuul
60db1f847c Merge "slo: part-number=N query parameter support" 2024-03-13 00:13:45 +00:00
Clay Gerrard
d10351db30 s3api test for zero byte mpu
Change-Id: I89050cead3ef2d5f8ebfc9cb58f736f33b1c44fe
2024-03-12 13:48:02 +00:00
indianwhocodes
46e7da97c6 s3api: Support GET/HEAD request with ?partNumber
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Closes-Bug: #1735284
Change-Id: Ib396309c706fbc6bc419377fe23fcf5603a89f45
2024-03-12 13:47:55 +00:00
indianwhocodes
6adbeb4036 slo: part-number=N query parameter support
This change allows individual SLO segments to be downloaded by adding
an extra 'part-number' query parameter to the GET request.  You can
also retrieve the Content-Length of an individual segment with a HEAD
request.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I7af0dc9898ca35f042b52dd5db000072f2c7512e
2024-03-12 06:47:02 -07:00
Matthew Oliver
4135133a63 memcachering: change failed to yield log message
Currently when the memcachering `_get_conns` method runs out of memcached
servers to try and so fails to yield anything we log a:

  All memcached servers error-limited

However, this error message isn't entirely accurate. It can also fail
because it failed to connect all it's memcached servers not just because
they're error limited.
You can disable error-limiting of memcached servers. So in this case
this error message is a red-herring.

Downstream we use a mcrouter client on each node which itself talks to a
bunch of memcache servers. Therefore in swift's memcachering client we
only configure the 1 mcrouter client as a single server in the ring.
Because of this we disable memcached error-limiting.
If the node gets too overloaded we've had timeouts talking to the local
mcrouter client. This fires off error-limitted log messages which can
confuse things.

Because it's possible to turn off error-limiting, the log line isn't
quite adequate anymore. So this patch changes it to:

  No more memcached servers to try

Change-Id: I97fb4f3ee2ac45831aae14a782b2c6dc73e82d85
2024-03-05 14:44:37 +11:00
Zuul
3478803a95 Merge "zero bytes manifests are not legacy" 2024-02-28 17:34:04 +00:00
Zuul
e9cf2a31aa Merge "tests: Clear txn id on init for all debug loggers" 2024-02-28 17:05:34 +00:00
Zuul
0947e94f66 Merge "staticweb: Work with prefix-based tempurls" 2024-02-28 07:42:52 +00:00
Clay Gerrard
130188b6c0 zero bytes manifests are not legacy
Change-Id: I7c8adb129b8770eee501748a378f3adc42c8cd39
2024-02-27 17:21:00 -06:00
Zuul
4c5f41cc1f Merge "Fix diskfile test failing on macOS" 2024-02-27 21:32:05 +00:00
Tim Burke
1ee9b1e3ba tests: Clear txn id on init for all debug loggers
Since we fake out all the greenthread stuff to run in the main thread,
we can (sometimes?) find that a transaction ID has already been set,
leading to failures in test_bad_request_app_logging like

    AssertionError: b'X-Trans-Id: test-trans-id' not found
    in b'X-Trans-Id: tx...'

By resetting the logger's txn_id, we're assured that our mock will be
run and the expected transaction ID will be used.

Change-Id: I465eed5372a2a5e591f80a09676f4b7f091cd444
2024-02-27 09:49:50 -08:00
Zuul
07c8e8bcdc Merge "Object-server: add periodic greenthread yielding during file read." 2024-02-27 04:03:00 +00:00
Jianjian Huo
d5877179a5 Object-server: add periodic greenthread yielding during file read.
Currently, when object-server serves GET request and DiskFile
reader iterate over disk file chunks, there is no explicit
eventlet sleep called. When network outpace the slow disk IO,
it's possible one large and slow GET request could cause
eventlet hub not to schedule any other green threads for a
long period of time. To improve this, this patch add a
configurable sleep parameter into DiskFile reader, which
is 'cooperative_period' with a default value of 0 (disabled).

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I80b04bad0601b6cd6caef35498f89d4ba70a4fd4
2024-02-27 11:24:41 +11:00
Alistair Coles
2da150b890 Fix diskfile test failing on macOS
The existing test fails on macOS because the value of errno.ENODATA is
platform dependent. On macOS ENODATA is 96:

% man 2 intro|grep ENODATA
     96 ENODATA No message available.

Change-Id: Ibc760e641d4351ed771f2321dba27dc4e5b367c1
2024-02-26 11:12:47 +00:00
Alistair Coles
2500fbeea9 proxy: don't use recoverable_node_timeout with x-newest
Object GET requests with a truthy X-Newest header are not resumed if a
backend request times out. The GetOrHeadHandler therefore uses the
regular node_timeout when waiting for a backend connection response,
rather than the possibly shorter recoverable_node_timeout. However,
previously while reading data from a backend response the
recoverable_node_timeout would still be used with X-Newest requests.

This patch simplifies GetOrHeadHandler to never use
recoverable_node_timeout when X-Newest is truthy.

Change-Id: I326278ecb21465f519b281c9f6c2dedbcbb5ff14
2024-02-26 09:54:36 +00:00
Alistair Coles
8061dfb1c3 proxy-server: de-duplicate _get_next_response_part method
Both GetOrHeadHandler (used for replicated policy GETs) and
ECFragGetter (used for EC policy GETs) have _get_next_response_part
methods that are very similar. This patch replaces them with a single
method in the common GetterBase superclass.

Both classes are modified to use *only* the Request instance passed to
their constructors. Previously their entry methods
(GetOrHeadHandler.get_working_response and
ECFragGetter.response_parts_iter) accepted a Request instance as an
arg and the class then variably referred to that or the Request
instance passed to the constructor. Both instances must be the same
and it is therefore safer to only allow the Request to be passed to
the constructor.

The 'newest' keyword arg is dropped from the GetOrHeadHandler
constructor because it is never used.

This refactoring patch makes no intentional behavioral changes, apart
from the text of some error log messages which have been changed to
differentiate replicated object GETs from EC fragment GETs.

Change-Id: I148e158ab046929d188289796abfbbce97dc8d90
2024-02-26 09:50:22 +00:00
Zuul
50336c5098 Merge "test: all primary error limit is error" 2024-02-21 19:52:20 +00:00
Zuul
439dc93cc4 Merge "Add ClosingIterator class; be more explicit about closes" 2024-02-21 18:35:42 +00:00
Clay Gerrard
89dd515310 test: all primary error limit is error
Change-Id: Ib790be26a2b990f313484f9ebdc99b8dc14613c9
2024-02-21 10:32:19 -06:00
Zuul
3aba22fde5 Merge "Stop using deprecated datetime.utc* functions" 2024-02-15 01:34:31 +00:00
Tim Burke
c522f5676e Add ClosingIterator class; be more explicit about closes
... in document_iters_to_http_response_body.

We seemed to be relying a little too heavily upon prompt garbage
collection to log client disconnects, leading to failures in
test_base.py::TestGetOrHeadHandler::test_disconnected_logging
under python 3.12.

Closes-Bug: #2046352
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I4479d2690f708312270eb92759789ddce7f7f930
2024-02-12 11:16:09 +00:00
Zuul
51ae9b00c9 Merge "lint: Consistently use assertIsInstance" 2024-02-08 04:36:37 +00:00
Zuul
ad41371005 Merge "lint: Up-rev hacking" 2024-02-08 04:33:39 +00:00
Zuul
93d654024a Merge "diskfile: Ignore invalid suffixes in invalidations file" 2024-02-08 01:53:36 +00:00
Zuul
4d3f9fe952 Merge "sharding: don't replace own_shard_range without an epoch" 2024-02-08 01:04:58 +00:00
Tim Burke
ce9e56a6d1 lint: Consistently use assertIsInstance
This has been available since py32 and was backported to py27; there
is no point in us continuing to carry the old idiom forward.

Change-Id: I21f64b8b2970e2dd5f56836f7f513e7895a5dc88
2024-02-07 15:48:39 -08:00
Tim Burke
76ca11773e lint: Up-rev hacking
Last time we did this was nearly 4 years ago; drag ourselves into
something approaching the present. Address a few new pyflakes issues
that seem reasonable to enforce:

   E275 missing whitespace after keyword
   E231 missing whitespace after ','
   E721 do not compare types, for exact checks use `is` / `is not`,
        for instance checks use `isinstance()`

Main motivator is that the old hacking kept us on an old version
of flake8 et al., which no longer work with newer Pythons.

Change-Id: I54b46349fabb9776dcadc6def1cfb961c123aaa0
2024-02-07 15:48:39 -08:00
Matthew Oliver
8227f4539c sharding: don't replace own_shard_range without an epoch
We've observed a root container suddenly thinks it's unsharded when it's
own_shard_range is reset.  This patch blocks a remote osr with an epoch
of None from overwriting a local epoched OSR.

The only way we've observed this happen is when a new replica or handoff
node creates a container and it's new own_shard_range is created without
an epoch and then replicated to older primaries.

However, if a bad node with a non-epoched OSR is on a primary, it's
newer timestamp would prevent pulling the good osr from it's peers.  So
it'll be left stuck with it's bad one.

When this happens expect to see a bunch of:
    Ignoring remote osr w/o epoch: x, from: y

When an OSR comes in from a replica that doesn't have an epoch when
it should, we do a pre-flight check to see if it would remove the epoch
before emitting the error above. We do this because when sharding is
first initiated it's perfectly valid to get OSR's without epochs from
replicas. This is expected and harmless.

Closes-bug: #1980451
Change-Id: I069bdbeb430e89074605e40525d955b3a704a44f
2024-02-07 13:37:58 -08:00
Tim Burke
c5d743347c diskfile: Ignore invalid suffixes in invalidations file
Change-Id: I0357939cf3a12712e6719c257705cf565e3afc8b
2024-02-06 20:24:03 -08:00
Tim Burke
1936f6735c replicator: Rename update_deleted to revert
This is a more-intuitive name for what's going on and it's been working
well for us in the reconstructor.

Change-Id: Id935de4ca9eb6f38b0d587eaed8d13c54bd89d60
2024-02-06 20:24:03 -08:00
Zuul
afe31b4c01 Merge "tests: Fix float expectations for py312" 2024-02-06 10:16:53 +00:00
Tim Burke
8c4e65a6b5 staticweb: Work with prefix-based tempurls
Note that there's a bit of a privilege escalation as prefix-based
tempurls can now be used to perform listings -- but only on containers
with staticweb enabled. Since having staticweb enabled was previously
pretty useless unless the container was both public and
publicly-listable, I think it's probably fine.

This also allows tempurls to be used at the container level, but only
for staticweb responses.

Change-Id: I7949185fdd3b64b882df01d54a8bc158ce2d7032
2024-02-05 15:13:12 -08:00