<?xml version="1.0" encoding="utf-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
  xmlns:xi="http://www.w3.org/2001/XInclude"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  version="5.0"
  xml:id="module003-ch006-more-concepts">
  <title>A Bit More On Swift</title>
  <para><guilabel>Containers and Objects</guilabel></para>
  <para>A container is a storage compartment for your data and
    provides a way for you to organize your data. You can think of a
    container as a folder in Windows or a directory in UNIX. The
    primary difference between a container and these other file
    system concepts is that containers cannot be nested. You can,
    however, create an unlimited number of containers within your
    account. Data must be stored in a container, so you must have at
    least one container defined in your account before you upload
    data.</para>
  <para>The only restrictions on container names are that they
    cannot contain a forward slash (/) or an ASCII null (%00) and
    must be less than 257 bytes in length. Note that the length
    restriction applies to the name after it has been URL encoded.
    For example, a container name of Course Docs would be URL
    encoded as Course%20Docs and therefore be 13 bytes in length
    rather than the expected 11.</para>
  <para>An object is the basic storage entity and any optional
    metadata that represents the files you store in the OpenStack
    Object Storage system. When you upload data to OpenStack Object
    Storage, the data is stored as-is (no compression or encryption)
    and consists of a location (container), the object's name, and
    any metadata consisting of key/value pairs. For instance, you
    may choose to store a backup of your digital photos and organize
    them into albums. In this case, each object could be tagged with
    metadata such as Album : Caribbean Cruise or Album : Aspen Ski
    Trip.</para>
  <para>The only restriction on object names is that they must be
    less than 1024 bytes in length after URL encoding. For example,
    an object name of C++final(v2).txt would be URL encoded as
    C%2B%2Bfinal%28v2%29.txt and therefore be 24 bytes in length
    rather than the expected 16.</para>
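  <para>The naming rules above can be checked before a request is
    sent. The following is a minimal sketch, not part of any
    OpenStack client: the function and constant names are
    hypothetical, and it assumes that Python's standard
    urllib.parse.quote produces the same encoding that the service
    measures.</para>
  <programlisting language="python"><![CDATA[
```python
from urllib.parse import quote

# Limits taken from the text; the constant names are illustrative.
MAX_CONTAINER_NAME_BYTES = 256   # "less than 257 bytes"
MAX_OBJECT_NAME_BYTES = 1023     # "less than 1024 bytes"


def encoded_length(name):
    """Length of a name after URL encoding (the result is pure ASCII)."""
    return len(quote(name))


def is_valid_container_name(name):
    """Reject forward slashes, NUL characters and over-long names."""
    if "/" in name or "\x00" in name:
        return False
    return encoded_length(name) <= MAX_CONTAINER_NAME_BYTES


def is_valid_object_name(name):
    """Object names are limited only by their encoded length."""
    return encoded_length(name) <= MAX_OBJECT_NAME_BYTES


print(encoded_length("Course Docs"))       # 13
print(encoded_length("C++final(v2).txt"))  # 24
```
]]></programlisting>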
  <para>The maximum allowable size for a storage object upon upload
    is 5 gigabytes (GB) and the minimum is zero bytes. You can use
    the built-in large object support and the swift utility to
    retrieve objects larger than 5 GB.</para>
  <para>For metadata, you should not exceed 90 individual key/value
    pairs for any one object, and the total byte length of all
    key/value pairs should not exceed 4 KB (4096 bytes).</para>
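  <para>The two metadata limits can be expressed as a small check.
    This is a sketch only: the function name is hypothetical, and it
    assumes that the byte length is measured on the UTF-8 encoding
    of each key and value.</para>
  <programlisting language="python"><![CDATA[
```python
# Limits taken from the text; the constant names are illustrative.
MAX_METADATA_PAIRS = 90
MAX_METADATA_BYTES = 4096


def metadata_within_limits(metadata):
    """Return True if a dict of metadata respects both limits."""
    total_bytes = sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in metadata.items()
    )
    return (len(metadata) <= MAX_METADATA_PAIRS
            and total_bytes <= MAX_METADATA_BYTES)


print(metadata_within_limits({"Album": "Caribbean Cruise"}))
```
]]></programlisting>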
  <para><guilabel>Language-Specific API Bindings</guilabel></para>
  <para>A set of supported API bindings in several popular languages
    is available from the Rackspace Cloud Files product, which uses
    OpenStack Object Storage code for its implementation. These
    bindings provide a layer of abstraction on top of the base REST
    API, allowing programmers to work with a container and object
    model instead of working directly with HTTP requests and
    responses. These bindings are free (as in beer and as in speech)
    to download, use, and modify. They are all licensed under the
    MIT License as described in the COPYING file packaged with each
    binding. If you do make any improvements to an API, you are
    encouraged (but not required) to submit those changes back to
    us.</para>
  <para>The API bindings for Rackspace Cloud Files are hosted at
    <link xlink:href="http://github.com/rackspace"
    >http://github.com/rackspace</link>. Feel free to coordinate
    your changes through GitHub or, if you prefer, send your changes
    to cloudfiles@rackspacecloud.com. Just make sure to indicate
    which language and version you modified and send a unified
    diff.</para>
  <para>Each binding includes its own documentation (either HTML,
    PDF, or CHM). They also include code snippets and examples to
    help you get started. The currently supported API bindings for
    OpenStack Object Storage are:</para>
  <itemizedlist>
    <listitem>
      <para>PHP (requires 5.x and the modules: cURL, FileInfo,
        mbstring)</para>
    </listitem>
    <listitem>
      <para>Python (requires 2.4 or newer)</para>
    </listitem>
    <listitem>
      <para>Java (requires JRE v1.5 or newer)</para>
    </listitem>
    <listitem>
      <para>C#/.NET (requires .NET Framework v3.5)</para>
    </listitem>
    <listitem>
      <para>Ruby (requires 1.8 or newer and mime-tools
        module)</para>
    </listitem>
  </itemizedlist>
  <para>There are no other supported language-specific bindings at
    this time. You are welcome to create your own language API
    bindings and we can help answer any questions during
    development, host your code if you like, and give you full
    credit for your work.</para>
  <para><guilabel>Proxy Server</guilabel></para>
  <para>The Proxy Server is responsible for tying together the rest
    of the OpenStack Object Storage architecture. For each request,
    it will look up the location of the account, container, or
    object in the ring (see below) and route the request
    accordingly. The public API is also exposed through the Proxy
    Server.</para>
  <para>A large number of failures are also handled in the Proxy
    Server. For example, if a server is unavailable for an object
    PUT, the Proxy Server asks the ring for a hand-off server and
    routes the request there instead.</para>
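  <para>The hand-off behaviour described above can be illustrated
    with a toy model. The sketch below is not Swift's ring
    implementation: the ring is assumed to be a plain list of node
    names, the replica count is passed in by the caller, and the
    function names are hypothetical. It only shows the idea of
    replacing an unavailable primary node with the next available
    hand-off node.</para>
  <programlisting language="python"><![CDATA[
```python
import hashlib


def get_nodes(ring, replicas, path):
    """Split a toy ring (a list of node names) into primary and
    hand-off nodes for the given path."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    start = int.from_bytes(digest[:4], "big") % len(ring)
    ordered = ring[start:] + ring[:start]
    return ordered[:replicas], ordered[replicas:]


def pick_targets(ring, replicas, path, is_up):
    """Return the nodes a request is routed to, substituting a
    hand-off node for every primary that is down."""
    primaries, handoffs = get_nodes(ring, replicas, path)
    remaining = iter(handoffs)   # each hand-off is used at most once
    targets = []
    for node in primaries:
        if is_up(node):
            targets.append(node)
            continue
        substitute = next((h for h in remaining if is_up(h)), None)
        if substitute is not None:
            targets.append(substitute)
    return targets


ring = ["node1", "node2", "node3", "node4", "node5"]
primaries, handoffs = get_nodes(ring, 3, "/account/container/object")
down = {primaries[0]}
print(pick_targets(ring, 3, "/account/container/object",
                   lambda node: node not in down))
```
]]></programlisting>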
  <para>When objects are streamed to or from an object server, they
    are streamed directly through the proxy server to or from the
    user; the proxy server does not spool them.</para>
  <para>You can use a proxy server with account management enabled
    by configuring it in the proxy server configuration file.</para>
  <para><guilabel>Object Server</guilabel></para>
  <para>The Object Server is a very simple blob storage server that
    can store, retrieve and delete objects stored on local devices.
    Objects are stored as binary files on the filesystem with
    metadata stored in the file’s extended attributes (xattrs). This
    requires that the filesystem chosen for object servers supports
    xattrs on files. Some filesystems, like ext3, have xattrs turned
    off by default.</para>
  <para>Each object is stored using a path derived from the object
    name’s hash and the operation’s timestamp. The last write always
    wins, which ensures that the latest object version is served. A
    deletion is also treated as a version of the file (a 0 byte file
    ending with “.ts”, which stands for tombstone). This ensures
    that deleted files are replicated correctly and that older
    versions do not reappear after a failure.</para>
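  <para>The last-write-wins rule and the role of the tombstone can
    be sketched as follows. This is a simplified illustration rather
    than the Object Server's own code: it assumes that the files for
    one object are named by timestamp with a “.data” or “.ts”
    extension, and the function name is hypothetical.</para>
  <programlisting language="python"><![CDATA[
```python
def resolve_latest(filenames):
    """Pick the newest version among the files stored for one object.

    Returns ("data", name) for a live object, ("deleted", name) when
    the newest file is a tombstone, or ("missing", None).
    """
    def timestamp(name):
        # "1368000000.00000.data" -> 1368000000.0
        return float(name.rsplit(".", 1)[0])

    candidates = [n for n in filenames if n.endswith((".data", ".ts"))]
    if not candidates:
        return ("missing", None)
    latest = max(candidates, key=timestamp)
    state = "deleted" if latest.endswith(".ts") else "data"
    return (state, latest)


# A tombstone newer than the data file means the object is deleted.
print(resolve_latest(["1368000000.00000.data", "1368000050.00000.ts"]))
```
]]></programlisting>
  <para>Because the tombstone carries a timestamp of its own, a
    replica that still holds the older “.data” file loses the
    comparison once the tombstone reaches it.</para>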
  <para><guilabel>Container Server</guilabel></para>
  <para>The Container Server’s primary job is to handle listings of
    objects. It does not know where those objects are, just what
    objects are in a specific container. The listings are stored as
    SQLite database files and replicated across the cluster in much
    the same way as objects. Statistics are also tracked, including
    the total number of objects and the total storage usage for
    that container.</para>
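  <para>A container listing with its statistics can be modelled with
    Python's built-in sqlite3 module. The table layout below is a
    hypothetical, much reduced schema chosen for illustration; it is
    not the schema that the Container Server actually uses.</para>
  <programlisting language="python"><![CDATA[
```python
import sqlite3

# In-memory database standing in for one container's listing file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE object (name TEXT PRIMARY KEY, size INTEGER)")
conn.executemany(
    "INSERT INTO object VALUES (?, ?)",
    [("photos/aspen.jpg", 2048), ("photos/cruise.jpg", 4096)],
)


def list_objects(conn):
    """Return the object names in the container, sorted by name."""
    rows = conn.execute("SELECT name FROM object ORDER BY name")
    return [row[0] for row in rows]


def container_stats(conn):
    """Return (object count, bytes used) for the container."""
    return conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(size), 0) FROM object"
    ).fetchone()


print(list_objects(conn))
print(container_stats(conn))   # (2, 6144)
```
]]></programlisting>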
  <para><guilabel>Account Server</guilabel></para>
  <para>The Account Server is very similar to the Container Server,
    except that it is responsible for listings of containers rather
    than objects.</para>
  <para><guilabel>Replication</guilabel></para>
  <para>Replication is designed to keep the system in a consistent
    state in the face of temporary error conditions like network
    outages or drive failures.</para>
  <para>The replication processes compare local data with each
    remote copy to ensure they all contain the latest version.
    Object replication uses a hash list to quickly compare
    subsections of each partition, and container and account
    replication use a combination of hashes and shared high water
    marks.</para>
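  <para>The hash-list comparison used for object replication can be
    approximated as shown below. This is a toy sketch, not the
    replicator's real data structures: it assumes a partition is a
    dictionary that maps a subsection name to the file names it
    holds, and both function names are hypothetical.</para>
  <programlisting language="python"><![CDATA[
```python
import hashlib


def subsection_hashes(partition):
    """Map each subsection of a partition to one hash of its files."""
    result = {}
    for subsection, names in partition.items():
        digest = hashlib.md5()
        for name in sorted(names):
            digest.update(name.encode("utf-8"))
        result[subsection] = digest.hexdigest()
    return result


def subsections_to_push(local, remote):
    """Return the local subsections whose hash differs on the peer."""
    local_hashes = subsection_hashes(local)
    remote_hashes = subsection_hashes(remote)
    return sorted(
        s for s, digest in local_hashes.items()
        if remote_hashes.get(s) != digest
    )


local = {"a83": ["1.data"], "f01": ["2.data", "3.data"]}
remote = {"a83": ["1.data"], "f01": ["2.data"]}
print(subsections_to_push(local, remote))   # ['f01']
```
]]></programlisting>
  <para>Only the subsections whose hashes differ need to be examined
    file by file, which keeps the comparison cheap when the replicas
    already agree.</para>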
  <para>Replication updates are push based. For object replication,
    updating is just a matter of rsyncing files to the peer. Account
    and container replication either push missing records over HTTP
    or rsync whole database files.</para>
  <para>The replicator also ensures that data is removed from the
    system. When an item (object, container, or account) is deleted,
    a tombstone is set as the latest version of the item. The
    replicator will see the tombstone and ensure that the item is
    removed from the entire system.</para>
  <para>To separate the cluster-internal replication traffic from
    client traffic, separate replication servers can be used. These
    replication servers are based on the standard storage servers,
    but they listen on the replication IP and only respond to
    REPLICATE requests. Standard storage servers can also serve
    REPLICATE requests, so an operator can transition to using a
    separate replication network with no cluster downtime.</para>
  <para>Replication IP and port information is stored in the ring on
    a per-node basis. These parameters will be used if they are
    present, but they are not required. If this information does not
    exist or is empty for a particular node, the node's standard IP
    and port will be used for replication.</para>
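  <para>The fallback rule can be written as a short helper. In this
    sketch a node is assumed to be a dictionary, and the key names
    (ip, port, replication_ip, replication_port) as well as the
    function name are chosen for illustration.</para>
  <programlisting language="python"><![CDATA[
```python
def replication_endpoint(node):
    """Return the (ip, port) pair to use for replication traffic.

    A missing or empty replication value falls back to the node's
    standard value.
    """
    ip = node.get("replication_ip") or node["ip"]
    port = node.get("replication_port") or node["port"]
    return ip, port


dedicated = {"ip": "10.0.0.5", "port": 6000,
             "replication_ip": "10.1.0.5", "replication_port": 6003}
plain = {"ip": "10.0.0.6", "port": 6000, "replication_ip": ""}

print(replication_endpoint(dedicated))   # ('10.1.0.5', 6003)
print(replication_endpoint(plain))       # ('10.0.0.6', 6000)
```
]]></programlisting>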
  <para><guilabel>Updaters</guilabel></para>
  <para>There are times when container or account data cannot be
    immediately updated. This usually occurs during failure
    scenarios or periods of high load. If an update fails, the
    update is queued locally on the file system, and the updater
    will process the failed updates. This is where an eventual
    consistency window will most likely come into play. For example,
    suppose a container server is under load and a new object is put
    into the system. The object will be immediately available for
    reads as soon as the proxy server responds to the client with
    success. However, the container server did not update the object
    listing, and so the update would be queued for a later update.
    Container listings, therefore, may not immediately contain the
    object.</para>
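  <para>The queue-and-retry behaviour described above can be modelled
    in a few lines. This is a toy model under stated assumptions: the
    pending updates are kept in an in-memory queue rather than on the
    file system, and the class and function names are
    hypothetical.</para>
  <programlisting language="python"><![CDATA[
```python
from collections import deque


class ToyContainerServer:
    """Stand-in for a container server that may be overloaded."""

    def __init__(self):
        self.listing = set()
        self.overloaded = False

    def add(self, name):
        if self.overloaded:
            raise RuntimeError("container server busy")
        self.listing.add(name)


def put_object(server, pending, name):
    """Record a new object; queue the listing update if it fails."""
    try:
        server.add(name)
    except RuntimeError:
        pending.append(name)


def run_updater(server, pending):
    """Retry each queued update once; failures stay in the queue."""
    for _ in range(len(pending)):
        name = pending.popleft()
        try:
            server.add(name)
        except RuntimeError:
            pending.append(name)


server, pending = ToyContainerServer(), deque()
server.overloaded = True
put_object(server, pending, "photo.jpg")
print("photo.jpg" in server.listing)   # False: listing lags behind
server.overloaded = False
run_updater(server, pending)
print("photo.jpg" in server.listing)   # True: listing has caught up
```
]]></programlisting>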
  <para>In practice, the consistency window is only as long as the
    interval between updater runs, and it may not even be noticed
    because the proxy server routes listing requests to the first
    container server that responds. The server under load may not be
    the one that serves subsequent listing requests; one of the
    other two replicas may handle the listing.</para>
  <para><guilabel>Auditors</guilabel></para>
  <para>Auditors crawl the local server checking the integrity of
    the objects, containers, and accounts. If corruption is found
    (in the case of bit rot, for example), the file is quarantined,
    and replication will replace the bad file from another replica.
    If other errors are found, they are logged (for example, when an
    object’s listing cannot be found on any container server where
    it should be).</para>
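  <para>An integrity check of this kind can be sketched as a checksum
    comparison. The example assumes that an MD5 checksum was recorded
    for each object when it was written, and that the objects are
    available as an in-memory dictionary; the function name is
    hypothetical, and the real auditor works on files on
    disk.</para>
  <programlisting language="python"><![CDATA[
```python
import hashlib


def audit(objects):
    """Return the names whose data no longer matches the checksum.

    objects maps a name to (data bytes, checksum recorded at write).
    """
    quarantined = []
    for name, (data, recorded) in sorted(objects.items()):
        if hashlib.md5(data).hexdigest() != recorded:
            quarantined.append(name)
    return quarantined


good = b"holiday photo"
checksum = hashlib.md5(good).hexdigest()
objects = {
    "intact.jpg": (good, checksum),
    "rotted.jpg": (b"holiday phot0", checksum),   # one byte changed
}
print(audit(objects))   # ['rotted.jpg']
```
]]></programlisting>
  <para>In the system described above, a quarantined file would then
    be replaced from another replica by the replication
    process.</para>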
</chapter>