openstack-manuals/doc/common/section_objectstorage-arch.xml

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE section [
<!ENTITY % openstack SYSTEM "entities/openstack.ent">
%openstack;
]>
<section xmlns="http://docbook.org/ns/docbook"
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    version="5.0"
    xml:id="section_objectstorage-cluster-architecture">
    <title>Cluster architecture</title>
    <section xml:id="section_access-tier">
        <title>Access tier</title>
    <para>Large-scale deployments segment off an access tier, which is considered the Object Storage
            system's central hub. The access tier fields the incoming API requests from clients and
            moves data in and out of the system. This tier consists of front-end load balancers,
            ssl-terminators, and authentication services. It runs the (distributed) brain of the
            Object Storage system: the proxy server processes.</para>
    <figure>
        <title>Object Storage architecture</title>
        <mediaobject>
            <imageobject>
                <imagedata fileref="../common/figures/objectstorage-arch.png"/>
            </imageobject>
        </mediaobject>
    </figure>
        <para>Because access servers are collocated in their own tier, you can scale out read/write
            access regardless of the storage capacity. For example, if a cluster is on the public
            Internet, requires SSL termination, and has a high demand for data access, you can
            provision many access servers. However, if the cluster is on a private network and used
            primarily for archival purposes, you need fewer access servers.</para>
        <para>Since this is an HTTP addressable storage service, you may incorporate a load balancer
            into the access tier.</para>
        <para>Typically, the tier consists of a collection of 1U servers. These machines use a
            moderate amount of RAM and are network I/O intensive. Since these systems field each
            incoming API request, you should provision them with two high-throughput (10GbE)
            interfaces - one for the incoming "front-end" requests and the other for the "back-end"
            access to the object storage nodes to put and fetch data.</para>
        <section xml:id="section_access-tier-considerations">
            <title>Factors to consider</title>
            <para>For most publicly facing deployments as well as private deployments available
                across a wide-reaching corporate network, you use SSL to encrypt traffic to the
                client. SSL adds significant processing load to establish sessions between clients,
                which is why you have to provision more capacity in the access layer. SSL may not be
                required for private deployments on trusted networks.</para>
        </section>
    </section>
    <section xml:id="section_storage-nodes">
        <title>Storage nodes</title>
        <para>In most configurations, each of the five zones should have an equal amount of storage
            capacity. Storage nodes use a reasonable amount of memory and CPU. Metadata needs to be
            readily available to return objects quickly. The object stores run services not only to
            field incoming requests from the access tier, but to also run replicators, auditors, and
            reapers. You can provision object stores provisioned with single gigabit or 10 gigabit
            network interface depending on the expected workload and desired performance.</para>
    <figure>
        <title>Object Storage (swift)</title>
        <mediaobject>
            <imageobject>
                <imagedata fileref="../common/figures/objectstorage-nodes.png"/>
            </imageobject>
        </mediaobject>
    </figure>
        <para>Currently, a 2&nbsp;TB or 3&nbsp;TB SATA disk delivers
        good performance for the price. You can use desktop-grade
        drives if you have responsive remote hands in the datacenter
        and enterprise-grade drives if you don't.</para>
    <section xml:id="section_storage-nodes-considerations">
            <title>Factors to consider</title>
            <para>You should keep in mind the desired I/O performance for single-threaded requests .
                This system does not use RAID, so a single disk handles each request for an object.
                Disk performance impacts single-threaded response rates.</para>
            <para>To achieve apparent higher throughput, the object storage system is designed to
                handle concurrent uploads/downloads. The network I/O capacity (1GbE, bonded 1GbE
                pair, or 10GbE) should match your desired concurrent throughput needs for reads and
                writes.</para>
    </section>
</section>
</section>