Configure Object Storage features

Object Storage zones

In OpenStack Object Storage, data is placed across different tiers of failure domains. First, data is spread across regions, then zones, then servers, and finally across drives. Data is placed to get the highest failure domain isolation. If you deploy multiple regions, the Object Storage service places the data across the regions. Within a region, each replica of the data should be stored in unique zones, if possible. If there is only one zone, data should be placed on different servers. And if there is only one server, data should be placed on different drives.

Regions are widely separated installations with a high-latency or otherwise constrained network link between them. Zones are arbitrarily assigned, and it is up to the administrator of the Object Storage cluster to choose an isolation level and attempt to maintain the isolation level through appropriate zone assignment. For example, a zone may be defined as a rack with a single power source. Or a zone may be a DC room with a common utility provider. Servers are identified by a unique IP/port. Drives are locally attached storage volumes identified by mount point.

In small clusters (five nodes or fewer), everything is normally in a single zone. Larger Object Storage deployments may assign zone designations differently; for example, an entire cabinet or rack of servers may be designated as a single zone to maintain replica availability if the cabinet becomes unavailable (for example, due to failure of the top of rack switches or a dedicated circuit). In very large deployments, such as service provider level deployments, each zone might have an entirely autonomous switching and power infrastructure, so that even the loss of an electrical circuit or switching aggregator would result in the loss of a single replica at most.
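Zone assignment is made when devices are added to a ring. For example, a device might be added to zone 3 of the object ring with a command along the following lines (the IP address, port, device name, and weight here are illustrative, not values from this guide):

$ swift-ring-builder object.builder add z3-192.168.1.12:6000/sda1 100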
Rackspace zone recommendations

For ease of maintenance on OpenStack Object Storage, Rackspace recommends that you set up at least five nodes. Each node is assigned its own zone (for a total of five zones), which gives you host-level redundancy. This enables you to take down a single zone for maintenance and still guarantee object availability in the event that another zone fails during your maintenance.

You could keep each server in its own cabinet to achieve cabinet-level isolation, but you may wish to wait until your swift service is better established before developing cabinet-level isolation. OpenStack Object Storage is flexible; if you later decide to change the isolation level, you can take down one zone at a time and move it to an appropriate new home.
RAID controller configuration

OpenStack Object Storage does not require RAID. In fact, most RAID configurations cause significant performance degradation. The main reason for using a RAID controller is the battery-backed cache. It is very important for data integrity reasons that when the operating system confirms a write has been committed, the write has actually been committed to a persistent location. Most disks lie about hardware commits by default, instead writing to a faster write cache for performance reasons. In most cases, that write cache exists only in non-persistent memory. In the case of a loss of power, this data may never actually get committed to disk, resulting in discrepancies that the underlying file system must handle.

OpenStack Object Storage works best on the XFS file system, and this document assumes that the hardware being used is configured appropriately to be mounted with the nobarrier option. For more information, refer to the XFS FAQ: http://xfs.org/index.php/XFS_FAQ
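For example, a data disk might be mounted as follows (the device and mount point are illustrative; disabling barriers is only safe when a battery-backed write cache is in place, as described in the next paragraphs):

$ mount -t xfs -o noatime,nodiratime,nobarrier,logbufs=8 /dev/sdb1 /srv/node/sdb1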
To get the most out of your hardware, it is essential that every disk used in OpenStack Object Storage is configured as a standalone, individual RAID 0 disk; in the case of six disks, you would have six RAID 0s or one JBOD. Some RAID controllers do not support JBOD or do not support battery-backed cache with JBOD. To ensure the integrity of your data, you must ensure that the individual drive caches are disabled and the battery-backed cache in your RAID card is configured and used. Failure to configure the controller properly in this case puts data at risk in the case of sudden loss of power.

You can also use hybrid drives or similar options for battery-backed cache configurations without a RAID controller.
Throttle resources through rate limits

Rate limiting in OpenStack Object Storage is implemented as a pluggable middleware that you configure on the proxy server. Rate limiting is performed on requests that result in database writes to the account and container SQLite databases. It uses memcached and depends on the proxy servers having highly synchronized time. The rate limits are limited by the accuracy of the proxy server clocks.

Configure rate limiting

All configuration is optional. If no account or container limits are provided, no rate limiting occurs. Several configuration options are available in the proxy server configuration.

The container rate limits are linearly interpolated from the values given. A sample container rate limiting configuration could be:

container_ratelimit_100 = 100
container_ratelimit_200 = 50
container_ratelimit_500 = 20

This would result in:
Values for Rate Limiting with Sample Configuration Settings

Container Size    Rate Limit
0-99              No limiting
100               100
150               75
500               20
1000              20
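The interpolation itself is simple to reproduce. Here is a minimal Python sketch (an illustration of the rule, not Swift's internal code) that computes the limit for a given container size from the sample settings above:

limits = [(100, 100), (200, 50), (500, 20)]  # (container size, ops per second)

def container_rate_limit(size, limits):
    """Linearly interpolate a rate limit between configured thresholds."""
    if size < limits[0][0]:
        return None  # below the first threshold: no limiting
    if size >= limits[-1][0]:
        return limits[-1][1]  # at or beyond the last threshold
    for (lo, lo_rate), (hi, hi_rate) in zip(limits, limits[1:]):
        if lo <= size < hi:
            frac = (size - lo) / float(hi - lo)
            return lo_rate + frac * (hi_rate - lo_rate)

print(container_rate_limit(150, limits))  # 75.0, matching the table above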
Health check

Provides an easy way to monitor whether the swift proxy server is alive. If you access the proxy with the path /healthcheck, it responds with OK in the response body, which monitoring tools can use.
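A monitoring probe can therefore be as simple as an HTTP GET. A minimal Python sketch (the proxy host and port are placeholders):

from urllib.request import urlopen

# A healthy proxy returns the body "OK".
with urlopen('http://proxy.example.com:8080/healthcheck') as resp:
    assert resp.read() == b'OK'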
Domain remap

Middleware that translates container and account parts of a domain to path parameters that the proxy server understands.

CNAME lookup

Middleware that translates an unknown domain in the host
header to something that ends with the configured
storage_domain by looking up the given domain's CNAME
record in DNS.

Temporary URL

Allows the creation of URLs to provide temporary access
to objects. For example, a website may wish to provide a
link to download a large object in Swift, but the Swift
account has no public access. The website can generate a
URL that provides GET access for a limited time to the
resource. When the web browser user clicks on the link,
the browser downloads the object directly from Swift,
eliminating the need for the website to act as a proxy for
the request. If the user shares the link with all his
friends, or accidentally posts it on a forum, the direct
access is limited to the expiration time set when the
website created the link.

A temporary URL is the typical URL associated with an object, with two additional query parameters:

temp_url_sig
    A cryptographic signature

temp_url_expires
    An expiration date, in Unix time

An example of a temporary URL:

https://swift-cluster.example.com/v1/AUTH_a422b2-91f3-2f46-74b7-d7c9e8958f5d30/container/object?
temp_url_sig=da39a3ee5e6b4b0d3255bfef95601890afd80709&
temp_url_expires=1323479485
To create temporary URLs, first set the
X-Account-Meta-Temp-URL-Key header
on your Swift account to an arbitrary string. This string
serves as a secret key. For example, to set a key of
b3968d0207b54ece87cccc06515a89d4
using the swift command-line
tool:

$ swift post -m "Temp-URL-Key:b3968d0207b54ece87cccc06515a89d4"

Next, generate an HMAC-SHA1 (RFC 2104) signature to
specify:

- Which HTTP method to allow (typically GET or PUT)
- The expiry date as a Unix timestamp
- The full path to the object
- The secret key set as the X-Account-Meta-Temp-URL-Key

Here is code generating the signature for a GET valid for 24 hours on /v1/AUTH_account/container/object:

import hmac
from hashlib import sha1
from time import time

method = 'GET'
duration_in_seconds = 60 * 60 * 24
expires = int(time() + duration_in_seconds)
path = '/v1/AUTH_a422b2-91f3-2f46-74b7-d7c9e8958f5d30/container/object'
key = 'mykey'
# The signed message is the method, expiry timestamp, and object path,
# separated by newlines.
hmac_body = '%s\n%s\n%s' % (method, expires, path)
sig = hmac.new(key.encode(), hmac_body.encode(), sha1).hexdigest()
s = 'https://{host}{path}?temp_url_sig={sig}&temp_url_expires={expires}'
url = s.format(host='swift-cluster.example.com', path=path, sig=sig, expires=expires)
Any alteration of the resource path or query arguments results in a 401 Unauthorized error. Similarly, a PUT where GET was the allowed method returns a 401. HEAD is allowed if GET or PUT is allowed. Using this in combination with browser form post translation middleware could also allow direct-from-browser uploads to specific locations in Swift.

Note that changing the X-Account-Meta-Temp-URL-Key invalidates any previously generated temporary URLs within 60 seconds (the memcache time for the key). Swift supports up to two keys, specified by X-Account-Meta-Temp-URL-Key and X-Account-Meta-Temp-URL-Key-2. Signatures are checked against both keys, if present, to allow for key rotation without invalidating all existing temporary URLs.
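Conceptually, the server accepts a request if the signature matches either configured key. A minimal sketch of that check (an illustration, not Swift's actual implementation):

import hmac
from hashlib import sha1

def signature_is_valid(method, expires, path, sig, keys):
    # keys holds the values of X-Account-Meta-Temp-URL-Key and,
    # when set, X-Account-Meta-Temp-URL-Key-2.
    hmac_body = '%s\n%s\n%s' % (method, expires, path)
    for key in keys:
        expected = hmac.new(key.encode(), hmac_body.encode(), sha1).hexdigest()
        # compare_digest avoids leaking timing information.
        if hmac.compare_digest(expected, sig):
            return True
    return False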
Swift includes a script called swift-temp-url that generates the query parameters automatically:

$ bin/swift-temp-url GET 3600 /v1/AUTH_account/container/object mykey
/v1/AUTH_account/container/object?
temp_url_sig=5c4cc8886f36a9d0919d708ade98bf0cc71c9e91&
temp_url_expires=1374497657

Because this command only returns the path, you must prefix the Swift storage host name (for example, https://swift-cluster.example.com).

With GET Temporary URLs, a
Content-Disposition header is set
on the response so that browsers interpret this as a file
attachment to be saved. The file name chosen is based on
the object name, but you can override this with a
filename query parameter. The
following example specifies a filename of My Test File.pdf:

https://swift-cluster.example.com/v1/AUTH_a422b2-91f3-2f46-74b7-d7c9e8958f5d30/container/object?
temp_url_sig=da39a3ee5e6b4b0d3255bfef95601890afd80709&
temp_url_expires=1323479485&
filename=My+Test+File.pdf

To enable Temporary URL functionality, edit
/etc/swift/proxy-server.conf to add tempurl to the pipeline variable defined in the [pipeline:main] section. The tempurl entry should appear immediately before the authentication filters in the pipeline, such as authtoken, tempauth, or keystoneauth. For example:

[pipeline:main]
pipeline = healthcheck cache tempurl authtoken keystoneauth proxy-server
Name check filter

Name Check is a filter that disallows any paths that contain defined forbidden characters or that exceed a defined length.

Constraints

To change the OpenStack Object Storage internal limits,
update the values in the
swift-constraints section in the
swift.conf file. Use caution when
you update these values because they affect the
performance in the entire cluster.
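For example, swift.conf might contain a swift-constraints section like the following; the values shown are common defaults, but check the documentation for your installed version:

[swift-constraints]
max_file_size = 5368709122
max_meta_name_length = 128
max_object_name_length = 1024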
Cluster health

Use the swift-dispersion-report tool to measure overall cluster health. This tool checks if a set of deliberately distributed containers and objects are currently in their proper places within the cluster. For instance, a common deployment has three replicas of each object. The health of that object can be measured by checking if each replica is in its proper place. If only two of the three replicas are in place, the object's health can be said to be at 66.66%, where 100% would be perfect. A single object's health, especially an older object, usually reflects the health of the entire partition the object is in. If you create enough objects on a distinct percentage of the partitions in the cluster, you get a good estimate of the overall cluster health. In practice, about 1% partition coverage seems to balance well between accuracy and the amount of time it takes to gather results.

The first step in providing this health value is to create a new account solely for this usage. Next, you need to place the containers and objects throughout the system so that they are on distinct partitions. The swift-dispersion-populate tool does this by making up random container and object names until they fall on distinct partitions. Last, and repeatedly for the life of the cluster, you must run the swift-dispersion-report tool to check the health of each of these containers and objects.

These tools need direct access to the entire cluster and to the ring files (installing them on a proxy server suffices). The swift-dispersion-populate and swift-dispersion-report commands both use the same configuration file, /etc/swift/dispersion.conf.
Example dispersion.conf file:
[dispersion]
auth_url = http://localhost:8080/auth/v1.0
auth_user = test:tester
auth_key = testing
There are also configuration options for specifying the dispersion coverage (which defaults to 1%), retries, concurrency, and so on; however, the defaults are usually fine. Once the configuration is in place, run swift-dispersion-populate to populate the containers and objects throughout the cluster. Now that those containers and objects are in place, you can run swift-dispersion-report to get a dispersion report, or the overall health of the cluster. Here is an example of a cluster in perfect health:

$ swift-dispersion-report
Queried 2621 containers for dispersion reporting, 19s, 0 retries
100.00% of container copies found (7863 of 7863)
Sample represents 1.00% of the container partition space
Queried 2619 objects for dispersion reporting, 7s, 0 retries
100.00% of object copies found (7857 of 7857)
Sample represents 1.00% of the object partition space
Now, deliberately double the weight of a device in the object ring (with replication turned off) and re-run the dispersion report to show what impact that has:

$ swift-ring-builder object.builder set_weight d0 200
$ swift-ring-builder object.builder rebalance
...
$ swift-dispersion-report
Queried 2621 containers for dispersion reporting, 8s, 0 retries
100.00% of container copies found (7863 of 7863)
Sample represents 1.00% of the container partition space
Queried 2619 objects for dispersion reporting, 7s, 0 retries
There were 1763 partitions missing one copy.
77.56% of object copies found (6094 of 7857)
Sample represents 1.00% of the object partition space
You can see the health of the objects in the cluster has gone down significantly. Of course, this test environment has just four devices; in a production environment with many devices, the impact of one device change is much less. Next, run the replicators to get everything put back into place and then rerun the dispersion report:
... start object replicators and monitor logs until they're caught up ...
$ swift-dispersion-report
Queried 2621 containers for dispersion reporting, 17s, 0 retries
100.00% of container copies found (7863 of 7863)
Sample represents 1.00% of the container partition space
Queried 2619 objects for dispersion reporting, 7s, 0 retries
100.00% of object copies found (7857 of 7857)
Sample represents 1.00% of the object partition space
Alternatively, the dispersion report can be output in JSON format, which third-party utilities can consume more easily:

$ swift-dispersion-report -j
{"object": {"retries:": 0, "missing_two": 0, "copies_found": 7863, "missing_one": 0,
"copies_expected": 7863, "pct_found": 100.0, "overlapping": 0, "missing_all": 0}, "container":
{"retries:": 0, "missing_two": 0, "copies_found": 12534, "missing_one": 0, "copies_expected":
12534, "pct_found": 100.0, "overlapping": 15, "missing_all": 0}}
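Because the JSON form is machine-readable, it is easy to feed into monitoring. A minimal Python sketch (assuming swift-dispersion-report is on the PATH and configured as above):

import json
import subprocess

# Run the report in JSON mode and recompute the health percentages.
raw = subprocess.check_output(['swift-dispersion-report', '-j'])
report = json.loads(raw)
for ring in ('container', 'object'):
    stats = report[ring]
    pct = 100.0 * stats['copies_found'] / stats['copies_expected']
    print('%s copies found: %.2f%%' % (ring, pct))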
Static Large Object (SLO) support

This feature is very similar to Dynamic Large Object (DLO) support in that it enables the user to upload many objects concurrently and afterwards download them as a single object. It differs in that it does not rely on eventually consistent container listings to do so. Instead, a user-defined manifest of the object segments is used.
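The user-defined manifest is a JSON list describing each segment. A sketch of what such a manifest might look like (the paths, etags, and sizes are placeholders):

[{"path": "/container/segment-1", "etag": "etagoftheobjectsegment", "size_bytes": 1048576},
 {"path": "/container/segment-2", "etag": "etagoftheobjectsegment", "size_bytes": 1048576}]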
Container quotas

The container_quotas middleware implements simple quotas that can be imposed on swift containers by a user with the ability to set container metadata, most likely the account administrator. This can be useful for limiting the scope of containers that are delegated to non-admin users, exposed to formpost uploads, or just as a self-imposed sanity check.

Any object PUT operations that exceed these quotas return a 413 response (request entity too large) with a descriptive body.

Quotas are subject to several limitations: eventual consistency, the timeliness of the cached container_info (60 second TTL by default), and the inability to reject chunked transfer uploads that exceed the quota (though once the quota is exceeded, new chunked transfers are refused).

Set quotas by adding meta values to the container. These values are validated when you set them:

X-Container-Meta-Quota-Bytes: Maximum size of the container, in bytes.
X-Container-Meta-Quota-Count: Maximum object count of the container.
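For example, the swift command-line tool can set both values as container metadata (the container name images is a placeholder):

$ swift post -m quota-bytes:10000 -m quota-count:1000 images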
Account quotas

The x-account-meta-quota-bytes metadata entry is used to reject write requests (PUT, POST) if a given account quota (in bytes) is exceeded, while DELETE requests are still allowed.

The x-account-meta-quota-bytes metadata entry must be
set to store and enable the quota. Write requests to this
metadata entry are only permitted for resellers. There is
no account quota limitation on a reseller account even if
x-account-meta-quota-bytes is set.

Any object PUT operations that exceed the quota return a
413 response (request entity too large) with a descriptive
body.

The following command uses an admin account that owns the Reseller role to set a quota on the test account:

$ swift -A http://127.0.0.1:8080/auth/v1.0 -U admin:admin -K admin \
  --os-storage-url=http://127.0.0.1:8080/v1/AUTH_test post -m quota-bytes:10000

Here is the stat listing of an account where a quota has been set:

$ swift -A http://127.0.0.1:8080/auth/v1.0 -U test:tester -K testing stat
Account: AUTH_test
Containers: 0
Objects: 0
Bytes: 0
Meta Quota-Bytes: 10000
X-Timestamp: 1374075958.37454
X-Trans-Id: tx602634cf478546a39b1be-0051e6bc7a

This command removes the account quota:

$ swift -A http://127.0.0.1:8080/auth/v1.0 -U admin:admin -K admin \
  --os-storage-url=http://127.0.0.1:8080/v1/AUTH_test post -m quota-bytes:

Bulk delete

Use bulk-delete to delete multiple files from an account
with a single request. Responds to DELETE requests with an 'X-Bulk-Delete: true_value' header. The body of the DELETE request is a newline-separated list of files to delete, URL-encoded and in the form:

/container_name/obj_name
If all files are successfully deleted (or did not
exist), the operation returns HTTPOk. If any files failed
to delete, the operation returns HTTPBadGateway. In both
cases the response body is a JSON dictionary that shows
the number of files that were successfully deleted or not
found. The files that failed are listed.
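Based on the description above, a bulk delete request could be issued from Python along these lines (the host, port, account, token, and object paths are all placeholders):

import http.client

# Newline-separated, URL-encoded list of /container/object paths.
body = '\n'.join(['/photos/image1.jpg', '/photos/image2.jpg'])

conn = http.client.HTTPConnection('swift-cluster.example.com', 8080)
conn.request('DELETE', '/v1/AUTH_test', body=body.encode(),
             headers={'X-Bulk-Delete': 'true_value',
                      'X-Auth-Token': '<token>'})
resp = conn.getresponse()
print(resp.status, resp.read())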
Drive audit

The configuration items reference a script that can be run by using cron to watch for bad drives. If errors are detected, it unmounts the bad drive, so that OpenStack Object Storage can work around it. The script takes several configuration options.

Form post

Middleware that provides the ability to upload objects
to a cluster using an HTML form POST. The format of the
form is:

<form action="<swift-url>" method="POST"
      enctype="multipart/form-data">
  <input type="hidden" name="redirect" value="<redirect-url>" />
  <input type="hidden" name="max_file_size" value="<bytes>" />
  <input type="hidden" name="max_file_count" value="<count>" />
  <input type="hidden" name="expires" value="<unix-timestamp>" />
  <input type="hidden" name="signature" value="<hmac>" />
  <input type="file" name="file1" /><br />
  <input type="submit" />
</form>
The swift-url is the URL to the Swift
destination, such as:
https://swift-cluster.example.com/v1/AUTH_account/container/object_prefix
The name of each file uploaded is appended to the specified swift-url. So, you can upload directly to the root of a container with a URL like:

https://swift-cluster.example.com/v1/AUTH_account/container/

Optionally, you can include an object prefix to better separate different users' uploads, such as:

https://swift-cluster.example.com/v1/AUTH_account/container/object_prefix

The form method must be POST and the enctype must be
set as multipart/form-data.

The redirect attribute is the URL to redirect the
browser to after the upload completes. The URL has status
and message query parameters added to it, indicating the
HTTP status code for the upload (2xx is success) and a
possible message for further information if there was an
error (such as “max_file_size
exceeded”).The max_file_size attribute must be
included and indicates the largest single file upload that
can be done, in bytes.

The max_file_count attribute must be
included and indicates the maximum number of files that
can be uploaded with the form. Include additional <input type="file" name="filexx" /> elements if desired.

The expires attribute is the Unix timestamp before which
the form must be submitted; after that time, the form is invalidated.

The signature attribute is the HMAC-SHA1 signature of
the form. This sample Python code shows how to compute the
signature:
import hmac
from hashlib import sha1
from time import time

path = '/v1/account/container/object_prefix'
redirect = 'https://myserver.com/some-page'
max_file_size = 104857600
max_file_count = 10
expires = int(time() + 600)
key = 'mykey'
# The signed message is the path, redirect URL, size and count limits,
# and expiry timestamp, separated by newlines.
hmac_body = '%s\n%s\n%s\n%s\n%s' % (path, redirect,
    max_file_size, max_file_count, expires)
signature = hmac.new(key.encode(), hmac_body.encode(), sha1).hexdigest()
The key is the value of the
X-Account-Meta-Temp-URL-Key header
on the account.

Be certain to use the full path, from the /v1/ onward.

The command-line tool
swift-form-signature may be used
(mostly just when testing) to compute expires and
signature.

The file attributes must appear after the other attributes to be processed correctly. If attributes come after the file, they are not sent with the sub-request: the server would have to read the entire file into memory to parse attributes that follow it, and it does not have enough memory to service such requests. So, attributes that follow the file are ignored.

Static web sites

When configured, this middleware serves container data
as a static web site with index file and error file
resolution and optional file listings. This mode is
normally only active for anonymous requests.