Both the EC and replicated GET paths have is_good_source() methods
that are similar. Refactor to have just one implementation.
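As a rough sketch of what a single shared implementation looks like (the
exact status handling shown here is an assumption, not a quote of the
code):

    # Hedged sketch only: one is_good_source() used by both the EC and
    # replicated GET paths; the real method lives on the getter class.
    def is_good_source(status, server_type):
        # Object servers may legitimately return 416 for ranged GETs,
        # so treat that as a usable source for object requests.
        if server_type == 'Object' and status == 416:
            return True
        # otherwise any success or redirect status is a good source
        return 200 <= status < 400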
Change-Id: I661d3704a76e3d92bfcfeaed1fff4ed5e28c79b4
This patch tries to make the proxy-server GET path a little easier to
read. The changes are primarily to the GetOrHeadHandler and
ECFragGetter classes.
- Make generator method names more uniform.
- Use _<method> to make it more obvious which methods are internal
to the class and which methods are the interface to the class.
- Move the interface method to the end of the class.
- Add some commentary and docstrings.
No intended behavioral change.
Change-Id: I3d00b1071669a42526a31588a2643f91c58ea5a8
The proxy GetterBase manages a set of attributes related to the
backend source for a GET response: source (a response object), node
and source_parts_iter. These attributes are always updated together,
and so benefit from being encapsulated, along with some helper methods
to simplify the GET paths.
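For illustration only (the class and attribute names below are
assumptions, not the actual GetterBase API), the idea is roughly:

    # Hypothetical sketch: keep the backend response, the node it came
    # from and its parts iterator together, so they are always replaced
    # as a unit when the getter switches to a different source.
    class _Source(object):
        def __init__(self, resp, node, parts_iter=None):
            self.resp = resp
            self.node = node
            self.parts_iter = parts_iter

    # a getter can then swap sources atomically on error, e.g.:
    #     self.source = _Source(new_resp, new_node)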
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I76ea042ef4b3f5cc1caa4774e35d6191f4ca548e
Currently we simply mock statsd calls in the FakeLogger, and there are
also some helper methods for counting and collating the metrics that
were called. This FakeLogger is overloaded and doesn't simulate the
real world.
In real life we use a StatsdClient that is attached to the logger.
We've been in the situation where unit tests pass but the statsd client
stacktraces, because we don't actually fake the StatsdClient based on
the real one and let it use its internal logic.
This patch creates a new FakeStatsdClient that is based on the real
one; it can then be used (like the real statsd client) and attached to
the FakeLogger.
There is quite a bit of churn in the tests to make this work, because
we now have to look into the fake statsd client to check the faked
calls made.
The FakeStatsdClient does everything the real one does, except that it
overrides the _send method and socket creation so that no actual statsd
metrics are emitted.
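In outline, it is roughly the following (a simplified sketch, not the
exact test helper; only _send and the socket override are named by this
change, the other names and the import are assumptions):

    from swift.common.utils import StatsdClient   # the real client

    # Hedged sketch: subclass the real StatsdClient so its internal
    # logic still runs, but capture payloads instead of sending them.
    class FakeStatsdClient(StatsdClient):
        def __init__(self, *args, **kwargs):
            super(FakeStatsdClient, self).__init__(*args, **kwargs)
            self.calls = []                  # recorded would-be sends

        def _open_socket(self):              # no real socket is created
            return None

        def _send(self, *args):              # no real metric is emitted
            self.calls.append(args)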
Change-Id: I9cdf395e85ab559c2b67b0617f898ad2d6a870d4
Personally I'm not a big fan of how we arrange logs for SAIO,
but it is a historic standard. The reconciler has to conform.
Change-Id: I45a25ff406b31b6b1b403e213554aaabfebc6eb5
The get_own_shard_range() method is called during every container GET
or HEAD request (as well as at other times). The method delegates to
get_shard_ranges() which queries the DB for shard ranges. Although the
query has conditions to only select a single shard range name, and the
method will return only the first matching shard range, the query had
no LIMIT. Adding a LIMIT of 1 significantly reduces execution time
when the DB has many shard range rows.
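For illustration, the shape of the fix is just a bound on the existing
query, along the lines of the following sketch (table and column names
are illustrative, not the literal SQL):

    def get_own_shard_range_row(conn, name):
        # conn: an open sqlite3 connection to the container DB.
        # The matching row is unique by name and only the first match
        # is ever used, so let SQLite stop at the first hit.
        sql = ('SELECT * FROM shard_range '
               'WHERE name = ? AND deleted = 0 '
               'LIMIT 1')
        return conn.execute(sql, (name,)).fetchone()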
On the author's machine, using a DB with ~12000 shard ranges, this
patch reduces the get_own_shard_range() execution time by a factor of
approximately 100.
Change-Id: I43f79de3f0603b9fab8bf6f7b4c3b1892a8919b3
This patch adds more granularity to the metrics for account_info and
container_info caching and the related backend lookups.
Before this patch, the related metrics are:
1. account.info.cache.[hit|miss|skip]
2. container.info.cache.[hit|miss|skip]
With this patch, they become:
1. account/container.info.infocache.hit
   cache hits served from infocache.
2. account/container.info.cache.hit
   cache hits served from memcache.
3. account/container.info.cache.[miss|skip|disabled].<status_int>
   backend lookups made for the following reasons:
   miss: cache misses.
   skip: selective skips per the skip percentage config.
   disabled: memcache is disabled.
For each kind of backend lookup metric, the suffix <status_int> counts
operations by response status. The sum of all status sub-metrics gives
the total metric for that operation.
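As an example of how the new names compose (a sketch; the variable
names are illustrative, not the emitting code):

    server_type = 'container'   # or 'account'
    cache_state = 'miss'        # or 'skip' / 'disabled' on a backend lookup
    status_int = 200            # status of the backend response
    metric = '%s.info.cache.%s.%d' % (server_type, cache_state, status_int)
    # -> 'container.info.cache.miss.200'; summing the .miss.* series
    # gives the total number of backend lookups caused by cache misses.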
UpgradeImpact
=============
Metrics dashboards will need updates to display these changed metrics
correctly; some infocache metrics are also newly added. Please see the
message above for all the changes needed.
Change-Id: I60a9f1c349b4bc78ecb850fb26ae56eb20fa39c6
A while back, we changed get_account_info and get_container_info to
call the proxy-server app directly, rather than whatever was to the
right of the current middleware in the pipeline. This reduced backend
request amplification
on cache misses.
However, it *also* meant that we stopped emitting logs or metrics in
the proxy for these backend requests. This was an unfortunate and
unintended break from earlier behavior.
Now, extend the middleware decorating we're doing in loadapp() to
include a "logged app", i.e., the app wrapped by its right-most
proxy-logging middleware. If there is no proxy-logging middleware (as
would be the case for the backend servers), the "logged app" will be
the main app. Make account and container info requests through *that*
app, so we get logging and metrics again.
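The gist, as a hedged sketch (the helper and argument names here are
assumptions, not the actual loadapp() internals):

    def pick_info_app(final_app, rightmost_proxy_logging=None):
        # If the pipeline contains a proxy-logging middleware, route
        # account/container info requests through the right-most one so
        # they are logged and counted; backend servers have none, so
        # they fall back to the main app.
        return rightmost_proxy_logging or final_app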
Closes-Bug: #2027726
Related-Change: I49447c62abf9375541f396f984c91e128b8a05d5
Change-Id: I3f531f904340e4c8407185ed64b41d7d614a308a
The client_chunk_size attribute was introduced into GetOrHeadHandler
for EC support [1]. It was only ever not None for an
ECObjectController. The ECObjectController stopped using
GetOrHeadHandler for Object GET when the ECFragGetter class was
introduced [2], but the EC specific code was not expunged from
GetOrHeadHandler. In [3] the ECFragGetter client_chunk_size was renamed
to fragment_size to better reflect what it represented.
The skip_bytes attribute was similarly introduced for EC support. It
is only ever non-zero if client_chunk_size is an int. For EC,
skip_bytes is used to undo the effect of expanding the backend
range(s) to fetch whole fragments: the range(s) of decoded bytes
returned to the client may need to be narrower than the backend
ranges. There is no equivalent requirement for replicated GETs.
The elimination of client_chunk_size and skip_bytes simplifies the
yielding of chunks from the GetOrHeadHandler response iter.
Related-Change:
[1] I9c13c03616489f8eab7dcd7c5f21237ed4cb6fd2
[2] I0dc5644a84ededee753e449e053e6b1786fdcf32
[3] Ie1efaab3bd0510275d534b5c023cb73c98bec90d
Change-Id: I31ed36d32682469e3c5ca8bf9a2b383568d63c72
The Related-Change extracted some attributes of the ShardRange class
into a new Namespace superclass. This patch adds a docstring for
Namespace.
Change-Id: I5a79c4a6da6e62698403bb0a5ef566355b5c850e
Related-Change: If98af569f99aa1ac79b9485ce9028fdd8d22576b
Within ContainerBroker, various places, for example get_db_state()
and sharding_initiated(), query the number of shard ranges in this
container by pulling all shard ranges out of the shard range table,
instantiating ShardRange objects and then counting them. Those
operations are very expensive, causing HEAD/GET requests to large
containers to be very slow.
Instead, this patch only checks whether any qualifying shard range
exists in the shard range table, using an optimized SQL query, which
can be very fast. On a container server setup serving a container with
~12000 shard ranges, this patch alone improves the HTTP HEAD request
rate by ~10X and the HTTP GET request rate by ~2X; together with other
optimizations (patch 888310 & 888575) targeting similar problems,
strong synergistic effects are seen, bringing a total ~22X improvement
to the HTTP HEAD and ~27X to the HTTP GET request rates.
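A hedged sketch of the kind of query this amounts to (the real
qualifying conditions and column names differ):

    def has_shard_ranges(conn):
        # conn: an open sqlite3 connection to the container DB.
        # An existence check instead of fetching every row and
        # instantiating ShardRange objects just to count them.
        sql = 'SELECT 1 FROM shard_range WHERE deleted = 0 LIMIT 1'
        return conn.execute(sql).fetchone() is not None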
Change-Id: I01fd4f3e395c8846280f44e17a56935fc6210444
Each time the sharder audits a shard container it makes a GET request
to the root container for shard ranges, typically specifying
[end_]marker constraints to select a single shard range. When a
container DB has a large number of shard ranges, the execution time of
the get_shard_ranges() method was dominated by the time taken to
instantiate ShardRange objects for every shard range before then
selecting only those that satisfied the [end_]marker constraints. The
get_shard_ranges() call can be much faster if the [end_]marker
filtering is performed by the SQL query *before* instantiating
ShardRange objects for the selected shard ranges.
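In outline, the marker filtering moves into the WHERE clause, e.g.
(a sketch; the column comparisons and surrounding code are
illustrative):

    def get_shard_range_rows(conn, marker=None, end_marker=None):
        # conn: an open sqlite3 connection to the root container DB.
        # Only rows satisfying the [end_]marker constraints are
        # fetched, so ShardRange objects are built just for the
        # selected rows.
        sql = 'SELECT * FROM shard_range WHERE deleted = 0'
        params = []
        if marker:
            sql += ' AND upper > ?'
            params.append(marker)
        if end_marker:
            sql += ' AND lower < ?'
            params.append(end_marker)
        return conn.execute(sql, params).fetchall()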
On the author's machine, using a DB with ~12000 shard ranges and
selecting a single shard range with marker and end_marker constraints,
this patch reduces the get_shard_ranges() execution time from
approximately 140ms to approximately 6ms.
A similar change was previously made to optimise calls with the
'includes' constraint [1].
[1] Related-Change: Ie40b62432dd9ccfd314861655f7972a2123938fe
Change-Id: Ic6e436f9e8fdff6a525466757f57f9be3a81c5b6
The 'log_route' argument of utils.get_logger() determines which global
Logger instance is wrapped by the returned LogAdapter. Most middlewares
(s3api being the exception) explicitly set 'log_route' to equal the
middleware 'brief' name e.g. 'bulk', 'tempauth' etc. However, the
s3api middleware sets 'log_route' to be the config 'log_name', if that
key is found in config.
When a proxy pipeline is instantiated via wsgi.run_wsgi(), all
middlewares and the proxy app are passed a default conf with
'"log_name": "proxy-server"'. As a result, the s3api middleware calls
get_logger() with log_route='proxy-server' and its LogAdapter
therefore shares the same Logger instance used by proxy-server app
(and any other middleware that similarly fails to explicitly
differentiate 'log_route').
Each Logger instance has a StatsdClient instance bound to it by
get_logger(). The Related-Change added statsd metrics to the s3api
middleware and sets 's3api' as the 'statsd_tail_prefix' when calling
get_logger(). This had the unintended effect of replacing the shared
Logger instance's StatsdClient with one that has prefix 's3api', such
that stats emitted by the proxy app (e.g. memcache shard range
hit/miss stats) would be erroneously prefixed with 's3api'.
This patch modifies the s3api middleware logger instantiation to
explicitly set log_route='s3api', so that the s3api middleware
LogAdapter now wraps a unique global Logger instance, with a unique
StatsdClient instance bound to it.
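In effect the middleware's logger setup becomes something like the
following (a sketch of the relevant call; the surrounding context is
simplified):

    from swift.common.utils import get_logger

    conf = {'log_name': 'proxy-server'}   # as handed down by run_wsgi()
    # Pin the log route so s3api gets its own Logger, and therefore its
    # own StatsdClient, instead of sharing the proxy-server's.
    logger = get_logger(conf, log_route='s3api', statsd_tail_prefix='s3api')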
The 'server' attribute of the middleware's LogAdapter, which may be
included in log lines by the "%(server)s" format element, is not
affected by this change. Its value is derived from the config
'log_name' or the 'name' argument passed to get_logger().
Change-Id: Ia89485bae8f92f4f3d9f5375cab8ff08f70a11a7
Related-Change: I4976b3ee24e4ec498c66359f391813261d42c495
Currently, SLO manifest files will be evicted from page cache after
being read, which makes hard drives very busy when a user requests a
lot of parallel byte-range GETs for a particular SLO object.
This patch adds a new config option, 'keep_cache_slo_manifest', and
tries to keep the manifest files in page cache by not evicting them
after reading, if the config settings allow it.
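As a sketch of how such an option is typically read (the default shown
and the surrounding wiring are assumptions, not the actual patch):

    from swift.common.utils import config_true_value

    conf = {'keep_cache_slo_manifest': 'true'}   # the server's conf dict
    keep_cache_slo_manifest = config_true_value(
        conf.get('keep_cache_slo_manifest', 'false'))
    # When true, reads of SLO manifest files skip the usual page-cache
    # eviction so the manifest stays cached for subsequent parallel
    # byte-range GETs.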
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I557bd01643375d7ad68c3031430899b85908a54f
It costs us an extra HEAD sometimes, but those at least get cached
some. Seems better than doing real GETs and going out to handoffs
every time when versioning isn't enabled.
Change-Id: Ic1b3a8c6c9b1aaead25070e49f514785c2d52c98