Object Storage monitoring

Object Storage monitoring Excerpted from a blog post by Darrell Bishop An OpenStack Object Storage cluster is a collection of many daemons that work together across many nodes. With so many different components, you must be able to tell what is going on inside the cluster. Tracking server-level metrics like CPU utilization, load, memory consumption, disk usage and utilization, and so on is necessary, but not sufficient. What are different daemons are doing on each server? What is the volume of object replication on node8? How long is it taking? Are there errors? If so, when did they happen? In such a complex ecosystem, you can use multiple approaches to get the answers to these questions. This section describes several approaches.

Swift Recon The Swift Recon middleware (see http://swift.openstack.org/admin_guide.html#cluster-telemetry-and-monitoring) provides general machine statistics, such as load average, socket statistics, /proc/meminfo contents, and so on, as well as Swift-specific metrics: The MD5 sum of each ring file. The most recent object replication time. Count of each type of quarantined file: Account, container, or object. Count of “async_pendings” (deferred container updates) on disk. Swift Recon is middleware that is installed in the object servers pipeline and takes one required option: A local cache directory. To track async_pendings, you must set up an additional cron job for each object server. You access data by either sending HTTP requests directly to the object server or using the swift-recon command-line client. There are some good Object Storage cluster statistics but the general server metrics overlap with existing server monitoring systems. To get the Swift-specific metrics into a monitoring system, they must be polled. Swift Recon essentially acts as a middleware metrics collector. The process that feeds metrics to your statistics system, such as collectd and gmond, probably already runs on the storage node. So, you can choose to either talk to Swift Recon or collect the metrics directly.

Swift-Informant Florian Hines developed the Swift-Informant middleware (see http://pandemicsyn.posterous.com/swift-informant-statsd-getting-realtime-telem) to get real-time visibility into Object Storage client requests. It sits in the pipeline for the proxy server, and after each request to the proxy server, sends three metrics to a StatsD server (see http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/): A counter increment for a metric like obj.GET.200 or cont.PUT.404. Timing data for a metric like acct.GET.200 or obj.GET.200. [The README says the metrics look like duration.acct.GET.200, but I do not see the duration in the code. I am not sure what the Etsy server does but our StatsD server turns timing metrics into five derivative metrics with new segments appended, so it probably works as coded. The first metric turns into acct.GET.200.lower, acct.GET.200.upper, acct.GET.200.mean, acct.GET.200.upper_90, and acct.GET.200.count]. A counter increase by the bytes transferred for a metric like tfer.obj.PUT.201. This is good for getting a feel for the quality of service clients are experiencing with the timing metrics, as well as getting a feel for the volume of the various permutations of request server type, command, and response code. Swift-Informant also requires no change to core Object Storage code because it is implemented as middleware. However, it gives you no insight into the workings of the cluster past the proxy server. If the responsiveness of one storage node degrades, you can only see that some of your requests are bad, either as high latency or error status codes. You do not know exactly why or where that request tried to go. Maybe the container server in question was on a good node but the object server was on a different, poorly-performing node.

Statsdlog Florian’s Statsdlog project increments StatsD counters based on logged events. Like Swift-Informant, it is also non-intrusive, but statsdlog can track events from all Object Storage daemons, not just proxy-server. The daemon listens to a UDP stream of syslog messages and StatsD counters are incremented when a log line matches a regular expression. Metric names are mapped to regex match patterns in a JSON file, allowing flexible configuration of what metrics are extracted from the log stream. Currently, only the first matching regex triggers a StatsD counter increment, and the counter is always incremented by one. There is no way to increment a counter by more than one or send timing data to StatsD based on the log line content. The tool could be extended to handle more metrics for each line and data extraction, including timing data. But a coupling would still exist between the log textual format and the log parsing regexes, which would themselves be more complex to support multiple matches for each line and data extraction. Also, log processing introduces a delay between the triggering event and sending the data to StatsD. It would be preferable to increment error counters where they occur and send timing data as soon as it is known to avoid coupling between a log string and a parsing regex and prevent a time delay between events and sending data to StatsD. The next section describes another method for gathering Object Storage operational metrics.

Swift StatsD logging StatsD (see http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/) was designed for application code to be deeply instrumented; metrics are sent in real-time by the code that just noticed or did something. The overhead of sending a metric is extremely low: a sendto of one UDP packet. If that overhead is still too high, the StatsD client library can send only a random portion of samples and StatsD approximates the actual number when flushing metrics upstream. To avoid the problems inherent with middleware-based monitoring and after-the-fact log processing, the sending of StatsD metrics is integrated into Object Storage itself. The submitted change set (see https://review.openstack.org/#change,6058) currently reports 124 metrics across 15 Object Storage daemons and the tempauth middleware. Details of the metrics tracked are in the Administrator's Guide. The sending of metrics is integrated with the logging framework. To enable, configure log_statsd_host in the relevant config file. You can also specify the port and a default sample rate. The specified default sample rate is used unless a specific call to a statsd logging method (see the list below) overrides it. Currently, no logging calls override the sample rate, but it is conceivable that some metrics may require accuracy (sample_rate == 1) while others may not. [DEFAULT] ... log_statsd_host = 127.0.0.1 log_statsd_port = 8125 log_statsd_default_sample_rate = 1 Then the LogAdapter object returned by get_logger(), usually stored in self.logger, has these new methods: set_statsd_prefix(self, prefix) Sets the client library stat prefix value which gets prefixed to every metric. The default prefix is the “name” of the logger such as “object-server”, “container-auditor”, and so on. This is currently used to turn “proxy-server” into one of “proxy-server.Account”, “proxy-server.Container”, or “proxy-server.Object” as soon as the Controller object is determined and instantiated for the request.

update_stats(self, metric, amount,
                        sample_rate=1)

Increments the supplied metric by the given amount. This is used when you need to add or subtract more that one from a counter, like incrementing “suffix.hashes” by the number of computed hashes in the object replicator.

increment(self, metric,
                        sample_rate=1)

Increments the given counter metric by one.

decrement(self, metric,
                        sample_rate=1)

Lowers the given counter metric by one.

timing(self, metric, timing_ms,
                        sample_rate=1)

Record that the given metric took the supplied number of milliseconds.

timing_since(self, metric, orig_time,
                        sample_rate=1)

Convenience method to record a timing metric whose value is “now” minus an existing timestamp. Note that these logging methods may safely be called anywhere you have a logger object. If StatsD logging has not been configured, the methods are no-ops. This avoids messy conditional logic each place a metric is recorded. These example usages show the new logging methods: # swift/obj/replicator.py def update(self, job): # ... begin = time.time() try: hashed, local_hash = tpool.execute(tpooled_get_hashes, job['path'], do_listdir=(self.replication_count % 10) == 0, reclaim_age=self.reclaim_age) # See tpooled_get_hashes "Hack". if isinstance(hashed, BaseException): raise hashed self.suffix_hash += hashed self.logger.update_stats('suffix.hashes', hashed) # ... finally: self.partition_times.append(time.time() - begin) self.logger.timing_since('partition.update.timing', begin) # swift/container/updater.py def process_container(self, dbfile): # ... start_time = time.time() # ... for event in events: if 200 <= event.wait() < 300: successes += 1 else: failures += 1 if successes > failures: self.logger.increment('successes') # ... else: self.logger.increment('failures') # ... # Only track timing data for attempted updates: self.logger.timing_since('timing', start_time) else: self.logger.increment('no_changes') self.no_changes += 1 The development team of StatsD wanted to use the pystatsd client library (not to be confused with a similar-looking project also hosted on GitHub), but the released version on PyPI was missing two desired features the latest version in GitHub had: the ability to configure a metrics prefix in the client object and a convenience method for sending timing data between “now” and a “start” timestamp you already have. So they just implemented a simple StatsD client library from scratch with the same interface. This has the nice fringe benefit of not introducing another external library dependency into Object Storage.