Cluster Arch Access Tier
Swift Cluster Architecture
Large-scale deployments segment off an "Access Tier". This tier is the “Grand Central” of the Object Storage system. It fields incoming API requests from clients and moves data in and out of the system. This tier is composed of front-end load balancers, ssl- terminators, authentication services, and it runs the (distributed) brain of the object storage system — the proxy server processes. Having the access servers in their own tier enables read/write access to be scaled out independently of storage capacity. For example, if the cluster is on the public Internet and requires ssl-termination and has high demand for data access, many access servers can be provisioned. However, if the cluster is on a private network and it is being used primarily for archival purposes, fewer access servers are needed. As this is an HTTP addressable storage service, a load balancer can be incorporated into the access tier. Typically, this tier comprises a collection of 1U servers. These machines use a moderate amount of RAM and are network I/O intensive. As these systems field each incoming API request, it is wise to provision them with two high-throughput (10GbE) interfaces. One interface is used for 'front-end' incoming requests and the other for 'back-end' access to the object storage nodes to put and fetch data. Factors to Consider For most publicly facing deployments as well as private deployments available across a wide-reaching corporate network, SSL will be used to encrypt traffic to the client. SSL adds significant processing load to establish sessions between clients; more capacity in the access layer will need to be provisioned. SSL may not be required for private deployments on trusted networks. Storage Nodes
Object Storage (Swift)
The next component is the storage servers themselves. Generally, most configurations should have each of the five Zones with an equal amount of storage capacity. Storage nodes use a reasonable amount of memory and CPU. Metadata needs to be readily available to quickly return objects. The object stores run services not only to field incoming requests from the Access Tier, but to also run replicators, auditors, and reapers. Object stores can be provisioned with single gigabit or 10 gigabit network interface depending on expected workload and desired performance. Currently 2TB or 3TB SATA disks deliver good price/performance value. Desktop-grade drives can be used where there are responsive remote hands in the datacenter, and enterprise-grade drives can be used where this is not the case. Factors to Consider Desired I/O performance for single-threaded requests should be kept in mind. This system does not use RAID, so each request for an object is handled by a single disk. Disk performance impacts single-threaded response rates. To achieve apparent higher throughput, the object storage system is designed with concurrent uploads/downloads in mind. The network I/O capacity (1GbE, bonded 1GbE pair, or 10GbE) should match your desired concurrent throughput needs for reads and writes.