5895 Commits

Zuul
07c8e8bcdc Merge "Object-server: add periodic greenthread yielding during file read." 2024-02-27 04:03:00 +00:00
Jianjian Huo
d5877179a5 Object-server: add periodic greenthread yielding during file read.
Currently, when the object-server serves a GET request and the DiskFile
reader iterates over disk file chunks, there is no explicit
eventlet sleep called. When the network outpaces the slow disk IO,
it's possible that one large and slow GET request could cause the
eventlet hub not to schedule any other green threads for a
long period of time. To improve this, this patch adds a
configurable sleep parameter, 'cooperative_period', to the
DiskFile reader, with a default value of 0 (disabled).

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I80b04bad0601b6cd6caef35498f89d4ba70a4fd4
2024-02-27 11:24:41 +11:00
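A minimal sketch of the cooperative yielding described above (the
function and parameter names are illustrative, not Swift's actual
DiskFile code; eventlet.sleep(0) is the standard way to yield to the hub):

  import eventlet

  def read_chunks(fp, chunk_size=65536, cooperative_period=0):
      """Yield chunks from an open file, periodically yielding to the hub."""
      chunks_read = 0
      while True:
          chunk = fp.read(chunk_size)
          if not chunk:
              break
          yield chunk
          chunks_read += 1
          if cooperative_period and chunks_read % cooperative_period == 0:
              # give other green threads a chance to run
              eventlet.sleep(0)
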
Alistair Coles
2500fbeea9 proxy: don't use recoverable_node_timeout with x-newest
Object GET requests with a truthy X-Newest header are not resumed if a
backend request times out. The GetOrHeadHandler therefore uses the
regular node_timeout when waiting for a backend connection response,
rather than the possibly shorter recoverable_node_timeout. However,
previously while reading data from a backend response the
recoverable_node_timeout would still be used with X-Newest requests.

This patch simplifies GetOrHeadHandler to never use
recoverable_node_timeout when X-Newest is truthy.

Change-Id: I326278ecb21465f519b281c9f6c2dedbcbb5ff14
2024-02-26 09:54:36 +00:00
Alistair Coles
8061dfb1c3 proxy-server: de-duplicate _get_next_response_part method
Both GetOrHeadHandler (used for replicated policy GETs) and
ECFragGetter (used for EC policy GETs) have _get_next_response_part
methods that are very similar. This patch replaces them with a single
method in the common GetterBase superclass.

Both classes are modified to use *only* the Request instance passed to
their constructors. Previously their entry methods
(GetOrHeadHandler.get_working_response and
ECFragGetter.response_parts_iter) accepted a Request instance as an
arg and the class then variably referred to that or the Request
instance passed to the constructor. Both instances must be the same
and it is therefore safer to only allow the Request to be passed to
the constructor.

The 'newest' keyword arg is dropped from the GetOrHeadHandler
constructor because it is never used.

This refactoring patch makes no intentional behavioral changes, apart
from the text of some error log messages which have been changed to
differentiate replicated object GETs from EC fragment GETs.

Change-Id: I148e158ab046929d188289796abfbbce97dc8d90
2024-02-26 09:50:22 +00:00
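A rough sketch of the resulting class shape (the class names come from
the commit message; the constructor signature and method body are
simplified placeholders, not the real implementation):

  class GetterBase:
      def __init__(self, app, req):
          self.app = app
          self.req = req  # the only Request instance the getter ever uses

      def _get_next_response_part(self, parts_iter):
          # single shared implementation, previously near-duplicated in
          # both subclasses
          return next(parts_iter)

  class GetOrHeadHandler(GetterBase):  # replicated-policy object GETs
      pass

  class ECFragGetter(GetterBase):  # EC-policy fragment GETs
      pass
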
Zuul
50336c5098 Merge "test: all primary error limit is error" 2024-02-21 19:52:20 +00:00
Zuul
439dc93cc4 Merge "Add ClosingIterator class; be more explicit about closes" 2024-02-21 18:35:42 +00:00
Clay Gerrard
89dd515310 test: all primary error limit is error
Change-Id: Ib790be26a2b990f313484f9ebdc99b8dc14613c9
2024-02-21 10:32:19 -06:00
Zuul
3aba22fde5 Merge "Stop using deprecated datetime.utc* functions" 2024-02-15 01:34:31 +00:00
Tim Burke
c522f5676e Add ClosingIterator class; be more explicit about closes
... in document_iters_to_http_response_body.

We seemed to be relying a little too heavily upon prompt garbage
collection to log client disconnects, leading to failures in
test_base.py::TestGetOrHeadHandler::test_disconnected_logging
under python 3.12.

Closes-Bug: #2046352
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I4479d2690f708312270eb92759789ddce7f7f930
2024-02-12 11:16:09 +00:00
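A minimal sketch of the wrapper idea (an assumed shape; the real
ClosingIterator lives in swift.common.utils and may differ in detail):

  class ClosingIterator:
      """Iterate over ``wrapped`` and close it (and friends) explicitly."""

      def __init__(self, wrapped, other_closeables=()):
          self.wrapped = iter(wrapped)
          self.other_closeables = list(other_closeables)

      def __iter__(self):
          return self

      def __next__(self):
          return next(self.wrapped)

      def close(self):
          # deterministic cleanup instead of waiting for garbage collection
          for closeable in [self.wrapped] + self.other_closeables:
              close = getattr(closeable, 'close', None)
              if close is not None:
                  close()
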
Zuul
51ae9b00c9 Merge "lint: Consistently use assertIsInstance" 2024-02-08 04:36:37 +00:00
Zuul
ad41371005 Merge "lint: Up-rev hacking" 2024-02-08 04:33:39 +00:00
Zuul
93d654024a Merge "diskfile: Ignore invalid suffixes in invalidations file" 2024-02-08 01:53:36 +00:00
Zuul
4d3f9fe952 Merge "sharding: don't replace own_shard_range without an epoch" 2024-02-08 01:04:58 +00:00
Tim Burke
ce9e56a6d1 lint: Consistently use assertIsInstance
This has been available since py32 and was backported to py27; there
is no point in us continuing to carry the old idiom forward.

Change-Id: I21f64b8b2970e2dd5f56836f7f513e7895a5dc88
2024-02-07 15:48:39 -08:00
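For illustration, the old and new idioms side by side (ExampleTest is a
made-up test case, not one from Swift's suite):

  import unittest

  class ExampleTest(unittest.TestCase):
      def test_type_check(self):
          resp = {'status': 200}
          # old idiom: passes, but a failure only reports "False is not true"
          self.assertTrue(isinstance(resp, dict))
          # preferred idiom: a failure reports the type that was actually seen
          self.assertIsInstance(resp, dict)

  if __name__ == '__main__':
      unittest.main()
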
Tim Burke
76ca11773e lint: Up-rev hacking
Last time we did this was nearly 4 years ago; drag ourselves into
something approaching the present. Address a few new pyflakes issues
that seem reasonable to enforce:

   E275 missing whitespace after keyword
   E231 missing whitespace after ','
   E721 do not compare types, for exact checks use `is` / `is not`,
        for instance checks use `isinstance()`

Main motivator is that the old hacking kept us on an old version
of flake8 et al., which no longer work with newer Pythons.

Change-Id: I54b46349fabb9776dcadc6def1cfb961c123aaa0
2024-02-07 15:48:39 -08:00
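Toy before/after examples of the three checks (the values are made up;
they just trigger and then satisfy each code):

  x = 5
  values = [1,2]                   # E231: missing whitespace after ','
  values = [1, 2]                  # fixed

  flag = not(x > len(values))      # E275: missing whitespace after keyword
  flag = not (x > len(values))     # fixed

  if type(values) == list:         # E721: compare types via isinstance()
      pass
  if isinstance(values, list):     # fixed
      pass
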
Matthew Oliver
8227f4539c sharding: don't replace own_shard_range without an epoch
We've observed that a root container suddenly thinks it's unsharded when
its own_shard_range is reset.  This patch blocks a remote OSR with an
epoch of None from overwriting a local epoched OSR.

The only way we've observed this happen is when a new replica or handoff
node creates a container and its new own_shard_range is created without
an epoch and then replicated to older primaries.

However, if a bad node with a non-epoched OSR is on a primary, its
newer timestamp would prevent pulling the good OSR from its peers, so
it'll be left stuck with its bad one.

When this happens expect to see a bunch of:
    Ignoring remote osr w/o epoch: x, from: y

When an OSR comes in from a replica that doesn't have an epoch when
it should, we do a pre-flight check to see if it would remove the epoch
before emitting the error above. We do this because when sharding is
first initiated it's perfectly valid to get OSRs without epochs from
replicas. This is expected and harmless.

Closes-bug: #1980451
Change-Id: I069bdbeb430e89074605e40525d955b3a704a44f
2024-02-07 13:37:58 -08:00
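A simplified sketch of the guard (real Swift code works with ShardRange
objects inside the container broker/replicator; the dicts and the helper
below are illustrative only):

  import logging

  logger = logging.getLogger(__name__)

  def merge_own_shard_range(local_osr, remote_osr):
      """Pick which own_shard_range to keep without ever losing an epoch."""
      if (local_osr.get('epoch') is not None
              and remote_osr.get('epoch') is None):
          # pre-flight check: accepting the remote would erase our epoch
          logger.warning('Ignoring remote osr w/o epoch: %r', remote_osr)
          return local_osr
      # otherwise, the usual newest-timestamp-wins merge
      if remote_osr.get('timestamp', 0) > local_osr.get('timestamp', 0):
          return remote_osr
      return local_osr
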
Tim Burke
c5d743347c diskfile: Ignore invalid suffixes in invalidations file
Change-Id: I0357939cf3a12712e6719c257705cf565e3afc8b
2024-02-06 20:24:03 -08:00
Tim Burke
1936f6735c replicator: Rename update_deleted to revert
This is a more-intuitive name for what's going on and it's been working
well for us in the reconstructor.

Change-Id: Id935de4ca9eb6f38b0d587eaed8d13c54bd89d60
2024-02-06 20:24:03 -08:00
Zuul
afe31b4c01 Merge "tests: Fix float expectations for py312" 2024-02-06 10:16:53 +00:00
Zuul
0cb02a6ce5 Merge "proxy: don't send multi-part terminator when no parts sent" 2024-02-05 20:22:43 +00:00
Tim Burke
e96a081024 tests: Fix float expectations for py312
From https://docs.python.org/3/whatsnew/3.12.html :

   sum() now uses Neumaier summation to improve accuracy and
   commutativity when summing floats or mixed ints and floats.

At least, I *think* that's what was causing the ring builder failures.

Partial-Bug: #2046352
Change-Id: Icae2f1e3e95f216d214636bd5a6d1f40aacab20d
2024-02-05 10:29:32 -08:00
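A commonly cited example of the difference (results depend on the
interpreter version, so test expectations had to follow suit):

  # On Python 3.11 and earlier this prints 0.9999999999999999 / False;
  # on 3.12, with Neumaier summation, it prints 1.0 / True.
  print(sum([0.1] * 10))
  print(sum([0.1] * 10) == 1.0)
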
Alistair Coles
dc3eda7e89 proxy: don't send multi-part terminator when no parts sent
If the proxy timed out while reading a replicated policy multi-part
response body, it would transform the ChunkReadTimeout to a
StopIteration. This masks the fact that the backend read has
terminated unexpectedly. The document_iters_to_multipart_byteranges
would complete iterating over parts and send a multipart terminator
line, even though no parts may have been sent.

This patch removes the conversion of ChunkReadTimeout to StopIteration.
The ChunkReadTimeout that is now raised prevents the
document_iters_to_multipart_byteranges 'for' loop completing and
therefore stops the multi-part terminator line being sent. It is
raised from the GetOrHeadHandler similar to other scenarios that raise
ChunkReadTimeouts while the resp body is being read.

A ChunkReadTimeout exception handler is removed in the
_iter_parts_from_response method. This handler was previously never
reached (because StopIteration rather than ChunkReadTimeout was raised
from _get_next_response_part), but if it were reached (i.e. with this
change) then it would repeat logging of the error and repeat
incrementing the node's error counter.

This change in the GetOrHeadHandler mimics a similar change in the
ECFragGetter [1].

[1] Related-Change: I0654815543be3df059eb2875d9b3669dbd97f5b4
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: I6dd53e239f5e7eefcf1c74229a19b1df1c989b4a
2024-02-05 10:28:40 +00:00
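A sketch of the behavioural difference (the helper names and terminator
bytes are illustrative; Swift's real ChunkReadTimeout is an eventlet
Timeout subclass in swift.common.exceptions):

  from eventlet import Timeout

  class ChunkReadTimeout(Timeout):
      pass

  def iter_parts(read_part, num_parts):
      for i in range(num_parts):
          # Previously a ChunkReadTimeout here was swallowed and turned into
          # StopIteration, so the consumer thought the document ended cleanly.
          yield read_part(i)  # now any ChunkReadTimeout propagates

  def write_multipart(parts, out):
      for part in parts:  # a propagated timeout aborts this loop ...
          out.write(part)
      out.write(b'--boundary--\r\n')  # ... so no bogus terminator is sent
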
Zuul
486fb23447 Merge "proxy: only use listing shards cache for 'auto' listings" 2024-02-01 11:59:47 +00:00
Alistair Coles
252f0d36b7 proxy: only use listing shards cache for 'auto' listings
The proxy should NOT read or write to memcache when handling a
container GET that explicitly requests 'shard' or 'object' record
type. A request for 'shard' record type may specify 'namespace'
format, but this request is unrelated to container listings or object
updates and passes directly to the backend.

This patch also removes unnecessary JSON serialisation and
de-serialisation of namespaces within the proxy GET path when a
sharded object listing is being built. The final response body will
contain a list of objects so there is no need to write intermediate
response bodies with a list of namespaces.

Requests that explicitly specify record type of 'shard' will of
course still have the response body with serialised shard dicts that
is returned from the backend.

Change-Id: Id79c156432350c11c52a4004d69b85e9eb904ca6
2024-01-31 11:02:54 +00:00
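The gist of the gating condition, as a simplified sketch (header handling
in the real proxy controller is more involved; the helper is illustrative):

  def may_use_shard_listing_cache(req_headers):
      # only plain 'auto' container GETs, i.e. no explicit record type
      # requested, may read from or write to the shard-listing cache
      record_type = req_headers.get('X-Backend-Record-Type', 'auto')
      return record_type.lower() == 'auto'
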
Zuul
bdbabbb809 Merge "test: swift.proxy_logging_status is really lazy (in a good way!)" 2024-01-25 23:17:04 +00:00
Clay Gerrard
0a6daa1ad5 test: swift.proxy_logging_status is really lazy (in a good way!)
Related-Change-Id: I9b5cc6d5fb69a2957b8c4846ce1feed8c115e6b6
Change-Id: I5dda9767c1c66597291211a087f7c917ba990651
2024-01-25 15:11:28 -06:00
Zuul
4eda676e2e Merge "Support swift.proxy_logging_status in request env" 2024-01-25 21:00:56 +00:00
Alistair Coles
a16e1f55a7 Improve unit tests for proxy GET ChunkReadTimeouts
Unit test changes only:

- Add tests for some resuming replicated GET scenarios.

- Add test to cover resuming GET fast_forward "failing" when range
  read is complete.

- Add test to verify different node_timeout for account and container
  vs object controller getters.

- Refactor proxy.test_server.py tests to split out different
  scenarios.

Drive-by: remove some ring device manipulation setup that's not needed.

Change-Id: I38c7fa648492c9bd2173ecf92f89e423bee4abf3
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
2024-01-25 14:13:48 +00:00
Matthew Oliver
b1836f9368 Update malformed_example.db to actually be malformed
It seems that since somewhere around sqlite 3.40+, the malformed sqlite
db used in our tests isn't malformed anymore. I don't actually know how
it was malformed, but looking in a hex editor it seems to have a bunch
of nulls truncated in the middle of the file, which maybe isn't an issue
anymore.

Instead I've gone and messed up what looks to be the marker before
the definition of the test table data at the end of the file, so from:

  00001FF0  00 00 00 00 00 00 00 00 00 00 00 03 01 02 0F 31 ...............1
                                             ^^

To:

  00001FF0  00 00 00 00 00 00 00 00 00 00 00 FF 01 02 0F 31 ...............1
                                             ^^

Basically, I've FF'ed the start of the data marker (or at least what I'm
calling it).

Closes-Bug: #2051067
Change-Id: I2a10adffa39abbf7e97718b7228de298209140f8
2024-01-25 04:37:04 +00:00
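One way to reproduce that edit programmatically (a sketch; the path is
illustrative and 0x1FFB is the offset of the 03 byte shown in the hex
dump above):

  path = 'test/unit/common/malformed_example.db'  # illustrative path
  with open(path, 'r+b') as fp:
      fp.seek(0x1FFB)       # the 03 marker byte on the 00001FF0 line above
      fp.write(b'\xff')     # overwrite it so sqlite treats the db as malformed
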
Zuul
52321866d9 Merge "tests: Exercise recent eventlet breakage without XFS" 2024-01-18 21:52:07 +00:00
Zuul
03b033f70f Merge "Work with latest eventlet (again)" 2024-01-18 19:24:37 +00:00
Zuul
4a278ae03f Merge "cli: add --sync to db info to show syncs" 2024-01-18 18:56:42 +00:00
Tim Burke
e39078135e tests: Exercise recent eventlet breakage without XFS
Recently, upper-constraints updated eventlet. Unfortunately, there
was a bug that breaks our unit tests, and it was not discovered during
cross-project testing because the affected unit tests require an
XFS temp dir. The requirements change has since been reverted, but we
ought to have tests that cover the problematic behavior that will
actually run as part of cross-project testing.

See https://github.com/eventlet/eventlet/pull/826 for the eventlet
change that introduced the bug; it has since been fixed on master in
https://github.com/eventlet/eventlet/pull/890 (though we still need
https://review.opendev.org/c/openstack/swift/+/905796 to be able to
work with eventlet master).

Change-Id: I4a6d79317b65f746ee29d2d25073b8c3859cd6a0
2024-01-18 10:35:52 -08:00
Zuul
569525a937 Merge "tests: Get test_handoff_non_durable passing with encryption enabled" 2024-01-18 08:47:36 +00:00
Tim Burke
7e3925aa9c tests: Fix probe test when encryption is enabled
Change-Id: I94e8cfd154aa058d91255efc87776224a919f572
2024-01-17 10:19:08 -08:00
Tim Burke
3ab9e45d6e Work with latest eventlet (again)
See https://github.com/eventlet/eventlet/pull/826 and its follow-up,
https://github.com/eventlet/eventlet/pull/890

Change-Id: I7dff5342013a3f31f19cb410a9f3f6d4b60938f1
2024-01-16 15:12:37 -08:00
Matthew Oliver
52c80d652d cli: add --sync to db info to show syncs
When looking at containers and accounts it's sometimes nice to know who
they've been replicating with. This patch adds a `--sync|-s` option to
swift-{container|account}-info which will also dump the incoming and
outgoing sync tables:

  $ swift-container-info /srv/node3/sdb3/containers/294/624/49b9ff074c502ec5e429e7af99a30624/49b9ff074c502ec5e429e7af99a30624.db -s
  Path: /AUTH_test/new
    Account: AUTH_test
    Container: new
    Deleted: False
    Container Hash: 49b9ff074c502ec5e429e7af99a30624
  Metadata:
    Created at: 2022-02-16T05:34:05.988480 (1644989645.98848)
    Put Timestamp: 2022-02-16T05:34:05.981320 (1644989645.98132)
    Delete Timestamp: 1970-01-01T00:00:00.000000 (0)
    Status Timestamp: 2022-02-16T05:34:05.981320 (1644989645.98132)
    Object Count: 1
    Bytes Used: 7
    Storage Policy: default (0)
    Reported Put Timestamp: 1970-01-01T00:00:00.000000 (0)
    Reported Delete Timestamp: 1970-01-01T00:00:00.000000 (0)
    Reported Object Count: 0
    Reported Bytes Used: 0
    Chexor: 962368324c2ca023c56669d03ed92807
    UUID: f33184e7-56d5-4c74-9d2e-5417c187d722-sdb3
    X-Container-Sync-Point2: -1
    X-Container-Sync-Point1: -1
  No system metadata found in db file
  No user metadata found in db file
  Sharding Metadata:
    Type: root
    State: unsharded
  Incoming Syncs:
    Sync Point	Remote ID                                	Updated At
    1         	ce7268a1-f5d0-4b83-b993-af17b602a0ff-sdb1	2022-02-16T05:38:22.000000 (1644989902)
    1         	2af5abc0-7f70-4e2f-8f94-737aeaada7f4-sdb4	2022-02-16T05:38:22.000000 (1644989902)
  Outgoing Syncs:
    Sync Point	Remote ID	Updated At
  Partition	294
  Hash     	49b9ff074c502ec5e429e7af99a30624

As a follow-up to the device-in-DB-ID patch, we can see that the
replicas at sdb1 and sdb4 have replicated with this node.

Change-Id: I23d786e82c6710bea7660a9acf8bbbd113b5b727
2024-01-16 08:19:08 -08:00
Zuul
2331c9abf2 Merge "tests: Switch get_v4_amz_date_header to take timedeltas" 2024-01-12 18:49:47 +00:00
Matthew Oliver
03b66c94f4 Proxy: Use namespaces when getting listing/updating shards
With the Related-Change, container servers can return a list of Namespace
objects in response to a GET request.  This patch modifies the proxy
to take advantage of this when fetching namespaces. Specifically,
the proxy only needs Namespaces when caching 'updating' or 'listing'
shard range metadata.

In order to allow upgrades to clusters we can't just send
'X-Backend-Record-Type = namespace', as old container servers won't
know how to respond. Instead, proxies send a new header
'X-Backend-Record-Shard-Format = namespace' along with the existing
'X-Backend-Record-Type = shard' header. Newer container servers will
return namespaces, old container servers continue to return full
shard ranges and they are parsed as Namespaces by the new proxy.

This patch refactors _get_from_shards to clarify that it does not
require ShardRange objects. The method is now passed a list of
namespaces, which is parsed from the response body before the method
is called. Some unit tests are also refactored to be more realistic
when mocking _get_from_shards.

Also refactor the test_container tests to better test shard-range and
namespace responses from legacy and modern container servers.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Jianjian Huo <jhuo@nvidia.com>
Related-Change: If152942c168d127de13e11e8da00a5760de5ae0d
Change-Id: I7169fb767525753554a40e28b8c8c2e265d08ecd
2024-01-11 10:46:53 +00:00
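The backwards-compatible request headers described above, as a small
sketch (the header names are taken from the commit message; the helper
itself is illustrative):

  def backend_shard_listing_headers():
      return {
          # old container servers understand this and return full shard ranges
          'X-Backend-Record-Type': 'shard',
          # new container servers additionally honour this and return the
          # slimmer namespace format; old servers simply ignore it
          'X-Backend-Record-Shard-Format': 'namespace',
      }
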
Jianjian Huo
c073933387 Container-server: add container namespaces GET
The proxy-server makes GET requests to the container server to fetch
full lists of shard ranges when handling object PUT/POST/DELETE and
container GETs, then it only stores the Namespace attributes (lower
and name) of the shard ranges into Memcache and reconstructs the list
of Namespaces based on those attributes. Thus, a namespaces GET
interface can be added into the backend container-server to only
return a list of those Namespace attributes.

On a container server setup which serves a container with ~12000
shard ranges, benchmarking results show that the request rate of the
HTTP GET all namespaces (states=updating) is ~12 op/s, while the
HTTP GET all shard ranges (states=updating) is ~3.2 op/s.

The new namespace GET interface supports most of the headers and
parameters supported by the shard range GET interface, for example
marker, end_marker, include and reverse. Two exceptions are:
'x-backend-include-deleted' cannot be supported
exceptions are: 'x-backend-include-deleted' cannot be supported
because there is no way for a Namespace to indicate the deleted state;
the 'auditing' state query parameter is not supported because it is
specific to the sharder which only requests full shard ranges.

Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: If152942c168d127de13e11e8da00a5760de5ae0d
2024-01-11 10:46:53 +00:00
Zuul
c1c41a145e Merge "Get tests passing with latest eventlet" 2024-01-10 22:49:56 +00:00
Zuul
7d5c73fcde Merge "ContainerBroker.get_shard_ranges(): states must be a list" 2024-01-09 20:38:06 +00:00
Alistair Coles
f3a32367bf ContainerBroker.get_shard_ranges(): states must be a list
The 'states' argument of get_shard_ranges() should be a list of ints,
but previously just a single int was tolerated. This was unnecessary
and led to inconsistent usage across call sites.

We'd like similar ContainerBroker methods, such as the anticipated
get_namespaces() [1], to have an interface consistent with
get_shard_ranges(), but not continue the unnecessary pattern of
supporting both a list and a single int argument for 'states'. This
patch therefore normalises all call sites to pass a list and
deprecates support for just a single int.

[1] Related-Change: If152942c168d127de13e11e8da00a5760de5ae0d
Change-Id: I056cefbf0894dbc68b9a6eb3d76ec4dc0a72de0d
2024-01-09 11:50:04 +00:00
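A sketch of the call-site normalisation (the helper is illustrative;
get_shard_ranges() itself is a ContainerBroker method):

  def normalize_states(states):
      """Always hand get_shard_ranges() a list of ints for 'states'."""
      if states is None:
          return None
      if isinstance(states, int):
          # deprecated single-int form: still tolerated, but call sites
          # should pass a list
          return [states]
      return list(states)
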
Zuul
a2a09a77bc Merge "Make the dark data watcher work with sharded containers" 2024-01-09 09:47:01 +00:00
Tim Burke
6b91334298 Make the dark data watcher work with sharded containers
Be willing to accept shards instead of objects when querying containers.
If we receive shards, be willing to query them looking for the object.

Change-Id: I0d8dd42f81b97dddd6cf8910afaef4ba85e67d27
Partial-Bug: #1925346
2024-01-09 15:11:45 +11:00
Alistair Coles
f2c6c19411 container-server unit tests: use self.ts everywhere
The setUp method creates a timestamp iterator, so let's use it
consistently in all the tests.

Change-Id: Ibd06b243c6db93380b99227ac79157269a64b28a
2024-01-05 12:17:46 +00:00
Tim Burke
603122a32c Stop using deprecated datetime.utc* functions
Change-Id: I9a205a4191e9b26784e507262cb66a1190c2bc71
2024-01-04 05:35:24 +00:00
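The replacement pattern, for reference (standard library only; note that
the replacements return timezone-aware datetimes, unlike the old naive
utc* results):

  from datetime import datetime, timezone

  now = datetime.now(timezone.utc)                  # was: datetime.utcnow()
  epoch = datetime.fromtimestamp(0, timezone.utc)   # was: utcfromtimestamp(0)
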
Tim Burke
bf7f3ff2f9 tests: Switch get_v4_amz_date_header to take timedeltas
Change-Id: Ic89141c0dce619390c2be8a01d231f9ff8e2056c
2024-01-04 00:34:22 +00:00
Tim Burke
fe0d138eab Get tests passing with latest eventlet
Previously, our tests would not just fail, but segfault on recent
eventlet releases. See https://github.com/eventlet/eventlet/issues/864
and https://github.com/python/cpython/issues/113631

Fortunately, it looks like we can just avoid actually monkey-patching
to dodge the bug.

Closes-Bug: #2047768
Change-Id: I0dc22dab05bc00722671dca3f0e6eb1cf6e18349
2024-01-02 12:40:55 -08:00
Clay Gerrard
5af7719ef3 Support swift.proxy_logging_status in request env
When logging a request, if the request environ has a
swift.proxy_logging_status item then use its value for the log
message status int.

The swift.proxy_logging_status hint may be used by other middlewares
when the desired logged status is different from the wire_status_int.

If the proxy_logging middleware detects a client disconnect then any
swift.proxy_logging_status item is ignored and a 499 status int is
logged, as per current behaviour. i.e.:

  * client disconnect overrides swift.proxy_logging_status and the
    response status
  * swift.proxy_logging_status overrides the response status

If the proxy_logging middleware catches an exception then the logged
status int will be 500 regardless of any swift.proxy_logging_status
item.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I9b5cc6d5fb69a2957b8c4846ce1feed8c115e6b6
2023-12-20 17:31:06 +00:00
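The precedence described above, as a small decision helper (a sketch only;
the ordering of the disconnect and exception cases relative to each other
is an assumption, and the real logic lives inside the proxy_logging
middleware):

  def logged_status(wire_status_int, env, client_disconnect=False,
                    caught_exception=False):
      if client_disconnect:
          return 499  # disconnect overrides everything else
      if caught_exception:
          return 500  # exceptions always log as 500
      return env.get('swift.proxy_logging_status', wire_status_int)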