[arch-design] Revise storage design section
1. Rearchitect content and remove duplication
2. Temporarily move use case storage requirements to the Use cases chapter

Change-Id: I4cca7e1cc3b857383f1e9fae4626f0f538c13243
Implements: blueprint arch-design-pike
This commit is contained in:
parent
0a2315e464
commit
5b38fe4828
@ -3,27 +3,11 @@ Storage design
|
||||
==============
|
||||
|
||||
Storage is found in many parts of the OpenStack cloud environment. This
|
||||
chapter describes persistent storage options you can configure with
|
||||
your cloud.
|
||||
|
||||
General
|
||||
~~~~~~~
|
||||
chapter describes storage types, design considerations, and options for
selecting persistent storage for your cloud environment.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
design-storage/design-storage-concepts
|
||||
design-storage/design-storage-planning-scaling.rst
|
||||
|
||||
|
||||
Storage types
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
design-storage/design-storage-block
|
||||
design-storage/design-storage-object
|
||||
design-storage/design-storage-file
|
||||
design-storage/design-storage-commodity
|
||||
|
||||
design-storage/design-storage-arch
|
||||
|
@ -1,17 +1,35 @@
|
||||
=====================================
|
||||
Storage capacity planning and scaling
|
||||
=====================================
|
||||
====================
|
||||
Storage architecture
|
||||
====================
|
||||
|
||||
An important consideration in running a cloud over time is projecting growth
|
||||
and utilization trends in order to plan capital expenditures for the short and
|
||||
long term. Gather utilization meters for compute, network, and storage, along
|
||||
with historical records of these meters. While securing major anchor tenants
|
||||
can lead to rapid jumps in the utilization of resources, the average rate of
|
||||
adoption of cloud services through normal usage also needs to be carefully
|
||||
monitored.
|
||||
Choosing storage back ends
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. TODO how to decide (encryption, SLA requirements, live migration
|
||||
availability)
|
||||
|
||||
Users will indicate different needs for their cloud architecture. Some may
|
||||
need fast access to many objects that do not change often, or want to
|
||||
set a time-to-live (TTL) value on a file. Others may access only storage
|
||||
that is mounted with the file system itself, but want it to be
|
||||
replicated instantly when starting a new instance. For other systems,
|
||||
ephemeral storage is the preferred choice. When you select
|
||||
:term:`storage back ends <storage back end>`,
|
||||
consider the following questions from the user's perspective (a
configuration sketch follows the list):
|
||||
|
||||
* Do I need block storage?
|
||||
* Do I need object storage?
|
||||
* Do I need to support live migration?
|
||||
* Should my persistent storage drives be contained in my compute nodes,
|
||||
or should I use external storage?
|
||||
* What is the platter count I can achieve? Do more spindles result in
|
||||
better I/O despite network access?
|
||||
* Which one results in the best cost-performance scenario I'm aiming for?
|
||||
* How do I manage the storage operationally?
|
||||
* How redundant and distributed is the storage? What happens if a
|
||||
storage node fails? To what extent can it mitigate my data-loss
|
||||
disaster scenarios?
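Answers to the block storage and external storage questions above often
translate directly into the Block Storage back end configuration. The
following ``cinder.conf`` excerpt is a minimal sketch rather than a definitive
implementation: it assumes one local LVM back end and one external Ceph
cluster, and the section names, pool name, and back end labels are
illustrative only.

.. code-block:: ini

   [DEFAULT]
   # Sketch: expose two back ends to the Block Storage scheduler.
   enabled_backends = lvm-local,ceph-external

   [lvm-local]
   # Volumes carved from a volume group on the storage node itself.
   volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
   volume_group = cinder-volumes
   volume_backend_name = LVM_LOCAL

   [ceph-external]
   # Volumes placed on an external, replicated Ceph cluster.
   volume_driver = cinder.volume.drivers.rbd.RBDDriver
   rbd_pool = volumes
   rbd_ceph_conf = /etc/ceph/ceph.conf
   volume_backend_name = CEPH_EXTERNAL

Operators typically map each ``volume_backend_name`` to a volume type so that
users can choose a back end when they create a volume.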
|
||||
|
||||
General storage considerations
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
A wide variety of operator-specific requirements dictates the nature of the
|
||||
storage back end. Examples of such requirements are as follows:
|
||||
|
||||
@ -23,135 +41,123 @@ We recommend that data be encrypted both in transit and at-rest.
|
||||
If you plan to use live migration, a shared storage configuration is highly
|
||||
recommended.
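One common way to provide the shared storage that live migration relies on is
to back ephemeral instance disks with a distributed store. The ``nova.conf``
excerpt below is a hedged sketch that assumes a pre-existing Ceph pool named
``vms``; other shared file systems, such as NFS mounted at
``/var/lib/nova/instances``, achieve the same goal.

.. code-block:: ini

   [libvirt]
   # Sketch: keep ephemeral instance disks in a shared Ceph RBD pool so
   # that any compute node can reach them during live migration.
   images_type = rbd
   images_rbd_pool = vms
   images_rbd_ceph_conf = /etc/ceph/ceph.conf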
|
||||
|
||||
Capacity planning for a multi-site cloud
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
An OpenStack cloud can be designed in a variety of ways to handle individual
|
||||
application needs. A multi-site deployment has additional challenges compared
|
||||
to single site installations.
|
||||
To deploy your storage by using only commodity hardware, you can use a number
|
||||
of open-source packages, as shown in :ref:`table_persistent_file_storage`.
|
||||
|
||||
When determining capacity options, take into account technical, economic and
|
||||
operational issues that might arise from specific decisions.
|
||||
.. _table_persistent_file_storage:
|
||||
|
||||
Inter-site link capacity describes the connectivity capability between
|
||||
different OpenStack sites. This includes parameters such as
|
||||
bandwidth, latency, whether or not a link is dedicated, and any business
|
||||
policies applied to the connection. The capability and number of the
|
||||
links between sites determine what kind of options are available for
|
||||
deployment. For example, if two sites have a pair of high-bandwidth
|
||||
links available between them, it may be wise to configure a separate
|
||||
storage replication network between the two sites to support a single
|
||||
swift endpoint and a shared Object Storage capability between them. An
|
||||
example of this technique, as well as a configuration walk-through, is
|
||||
available at `Dedicated replication network
|
||||
<https://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network>`_.
|
||||
Another option in this scenario is to build a dedicated set of tenant
|
||||
private networks across the secondary link, using overlay networks with
|
||||
a third party mapping the site overlays to each other.
|
||||
.. list-table:: Persistent file-based storage support
|
||||
:widths: 25 25 25 25
|
||||
:header-rows: 1
|
||||
|
||||
The capacity requirements of the links between sites are driven by
application behavior. If the link latency is too high, certain
applications that use a large number of small packets, for example
:term:`RPC <Remote Procedure Call (RPC)>` API calls, may encounter
issues communicating with each other or operating
properly. OpenStack may also encounter similar types of issues.
To mitigate this, the Identity service provides service call timeout
tuning to prevent issues authenticating against a central Identity service.
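The exact options vary by service, but as a rough sketch, the
``keystonemiddleware`` section of a service configuration file exposes the
timeout and retry tuning referred to above. The values shown here are
placeholders, not recommendations.

.. code-block:: ini

   [keystone_authtoken]
   # Sketch: tune token-validation calls made to a remote Identity service
   # over a high-latency inter-site link. Values are examples only.
   http_connect_timeout = 5
   http_request_max_retries = 3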
|
||||
* -
|
||||
- Object
|
||||
- Block
|
||||
- File-level
|
||||
* - Swift
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
-
|
||||
* - LVM
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
* - Ceph
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- Experimental
|
||||
* - Gluster
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
* - NFS
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
* - ZFS
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
* - Sheepdog
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
|
||||
Another network capacity consideration for a multi-site deployment is
|
||||
the amount and performance of overlay networks available for tenant
|
||||
networks. If using shared tenant networks across zones, it is imperative
|
||||
that an external overlay manager or controller be used to map these
|
||||
overlays together. It is necessary to ensure that the number of possible IDs
between the zones is identical.
|
||||
This list of open source file-level shared storage solutions is not
|
||||
exhaustive. Your organization may already have deployed a file-level shared
|
||||
storage solution that you can use.
|
||||
|
||||
.. note::
|
||||
|
||||
As of the Kilo release, OpenStack Networking was not capable of
|
||||
managing tunnel IDs across installations. So if one site runs out of
|
||||
IDs, but another does not, that tenant's network is unable to reach
|
||||
the other site.
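As a hedged sketch, keeping the ID space identical between zones usually means
configuring the same segmentation ID ranges in the ML2 plug-in at every site.
The range below is an example only.

.. code-block:: ini

   [ml2_type_vxlan]
   # Sketch: use an identical VNI range at every site so that overlays can
   # be mapped to each other by an external overlay manager.
   vni_ranges = 1:10000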
|
||||
**Storage driver support**
|
||||
|
||||
The ability for a region to grow depends on scaling out the number of
|
||||
available compute nodes. However, it may be necessary to grow cells in an
|
||||
individual region, depending on the size of your cluster and the ratio of
|
||||
virtual machines per hypervisor.
|
||||
In addition to the open source technologies, there are a number of
|
||||
proprietary solutions that are officially supported by OpenStack Block
|
||||
Storage. You can find a matrix of the functionality provided by all of the
|
||||
supported Block Storage drivers on the `CinderSupportMatrix
|
||||
wiki <https://wiki.openstack.org/wiki/CinderSupportMatrix>`_.
|
||||
|
||||
A third form of capacity comes in the multi-region-capable components of
|
||||
OpenStack. Centralized Object Storage is capable of serving objects
|
||||
through a single namespace across multiple regions. Since this works by
|
||||
accessing the object store through swift proxy, it is possible to
|
||||
overload the proxies. There are two options available to mitigate this
|
||||
issue:
|
||||
Also, you need to decide whether you want to support object storage in
|
||||
your cloud. The two common use cases for providing object storage in a
|
||||
compute cloud are to provide:
|
||||
|
||||
* Deploy a large number of swift proxies. The drawback is that the
|
||||
proxies are not load-balanced and a large file request could
|
||||
continually hit the same proxy.
|
||||
* users with a persistent storage mechanism.
|
||||
* a scalable, reliable data store for virtual machine images.
|
||||
|
||||
* Add a caching HTTP proxy and load balancer in front of the swift
|
||||
proxies. Since swift objects are returned to the requester via HTTP,
|
||||
this load balancer alleviates the load required on the swift
|
||||
proxies.
|
||||
Selecting storage hardware
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Capacity planning for a compute-focused cloud
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
.. TODO how to design (IOPS requirements [spinner vs SSD]/Read+Write/
|
||||
Availability/Migration)
|
||||
|
||||
Adding extra capacity to a compute-focused cloud is a horizontal scaling
process.
|
||||
Storage hardware architecture is determined by the storage architecture that
you select. Choose a storage architecture by evaluating possible solutions
against critical factors: user requirements, technical considerations, and
operational considerations. Consider the following factors when selecting
storage hardware:
|
||||
|
||||
We recommend using similar CPUs when adding extra nodes to the environment.
|
||||
This reduces the chance of breaking live-migration features if they are
|
||||
present. Scaling out hypervisor hosts also has a direct effect on network
|
||||
and other data center resources. We recommend you factor in this increase
|
||||
when reaching rack capacity or when requiring extra network switches.
|
||||
Cost
|
||||
Storage can be a significant portion of the overall system cost. For
|
||||
an organization that is concerned with vendor support, a commercial
|
||||
storage solution is advisable, although it comes with a higher price
|
||||
tag. If minimizing initial capital expenditure is a priority, you can design
a system based on commodity hardware. The trade-off is
|
||||
potentially higher support costs and a greater risk of
|
||||
incompatibility and interoperability issues.
|
||||
|
||||
Changing the internal components of a Compute host to account for increases in
|
||||
demand is a process known as vertical scaling. Swapping a CPU for one with more
|
||||
cores, or increasing the memory in a server, can help add extra capacity for
|
||||
running applications.
|
||||
Performance
|
||||
The latency of storage I/O requests indicates performance. Performance
|
||||
requirements affect which solution you choose.
|
||||
|
||||
Another option is to assess the average workloads and increase the number of
|
||||
instances that can run within the compute environment by adjusting the
|
||||
overcommit ratio.
|
||||
Scalability
|
||||
Scalability, along with expandability, is a major consideration in a
|
||||
general purpose OpenStack cloud. It might be difficult to predict
|
||||
the final intended size of the implementation as there are no
|
||||
established usage patterns for a general purpose cloud. It might
|
||||
become necessary to expand the initial deployment in order to
|
||||
accommodate growth and user demand.
|
||||
|
||||
.. note::
|
||||
It is important to remember that changing the CPU overcommit ratio can
have a detrimental effect and increase the potential for noisy neighbor
issues.
|
||||
Expandability
|
||||
Expandability is a major architecture factor for storage solutions
|
||||
with general purpose OpenStack cloud. A storage solution that
|
||||
expands to 50 PB is considered more expandable than a solution that
|
||||
only scales to 10 PB. This meter is related to scalability, which is
|
||||
the measure of a solution's performance as it expands.
|
||||
|
||||
The added risk of increasing the overcommit ratio is that more instances fail
|
||||
when a compute host fails. We do not recommend that you increase the CPU
|
||||
overcommit ratio in compute-focused OpenStack design architecture. It can
|
||||
increase the potential for noisy neighbor issues.
|
||||
|
||||
Capacity planning for a hybrid cloud
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
One of the primary reasons many organizations use a hybrid cloud is to
|
||||
increase capacity without making large capital investments.
|
||||
|
||||
Capacity and the placement of workloads are key design considerations for
|
||||
hybrid clouds. The long-term capacity plan for these designs must incorporate
|
||||
growth over time to prevent permanent consumption of more expensive external
|
||||
clouds. To avoid this scenario, account for future applications’ capacity
|
||||
requirements and plan growth appropriately.
|
||||
|
||||
It is difficult to predict the amount of load a particular application might
|
||||
incur if the number of users fluctuate, or the application experiences an
|
||||
unexpected increase in use. It is possible to define application requirements
|
||||
in terms of vCPU, RAM, bandwidth, or other resources and plan appropriately.
|
||||
However, other clouds might not use the same meter or even the same
|
||||
oversubscription rates.
|
||||
|
||||
Oversubscription is a method to emulate more capacity than may physically be
|
||||
present. For example, a physical hypervisor node with 32 GB RAM may host 24
|
||||
instances, each provisioned with 2 GB RAM. As long as all 24 instances do not
|
||||
concurrently use 2 full gigabytes, this arrangement works well. However, some
|
||||
hosts take oversubscription to extremes and, as a result, performance can be
|
||||
inconsistent. If at all possible, determine what the oversubscription rates
|
||||
of each host are and plan capacity accordingly.
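For reference, oversubscription on a compute node is controlled by allocation
ratios. The following ``nova.conf`` sketch shows the relevant options with
example values only; the RAM ratio of 1.5 corresponds to the 24 instances of
2 GB on a 32 GB host described above.

.. code-block:: ini

   [DEFAULT]
   # Sketch: allocation ratios determine how far a host is oversubscribed.
   cpu_allocation_ratio = 16.0
   ram_allocation_ratio = 1.5
   disk_allocation_ratio = 1.0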
|
||||
|
||||
Block Storage
|
||||
~~~~~~~~~~~~~
|
||||
Implementing Block Storage
|
||||
--------------------------
|
||||
|
||||
Configure Block Storage resource nodes with advanced RAID controllers
|
||||
and high-performance disks to provide fault tolerance at the hardware
|
||||
@ -165,8 +171,8 @@ In environments that place substantial demands on Block Storage, we
|
||||
recommend using multiple storage pools. In this case, each pool of
|
||||
devices should have a similar hardware design and disk configuration
|
||||
across all hardware nodes in that pool. This allows for a design that
|
||||
provides applications with access to a wide variety of Block Storage
|
||||
pools, each with their own redundancy, availability, and performance
|
||||
provides applications with access to a wide variety of Block Storage pools,
|
||||
each with their own redundancy, availability, and performance
|
||||
characteristics. When deploying multiple pools of storage, it is also
|
||||
important to consider the impact on the Block Storage scheduler which is
|
||||
responsible for provisioning storage across resource nodes. Ideally,
|
||||
@ -184,8 +190,7 @@ API services to provide uninterrupted service. In some cases, it may
|
||||
also be necessary to deploy an additional layer of load balancing to
|
||||
provide access to back-end database services responsible for servicing
|
||||
and storing the state of Block Storage volumes. It is imperative that a
|
||||
highly available database cluster is used to store the Block
|
||||
Storage metadata.
|
||||
highly available database cluster is used to store the Block Storage metadata.
|
||||
|
||||
In a cloud with significant demands on Block Storage, the network
|
||||
architecture should take into account the amount of East-West bandwidth
|
||||
@ -194,49 +199,63 @@ The selected network devices should support jumbo frames for
|
||||
transferring large blocks of data, and utilize a dedicated network for
|
||||
providing connectivity between instances and Block Storage.
|
||||
|
||||
Scaling Block Storage
|
||||
---------------------
|
||||
|
||||
You can upgrade Block Storage pools to add storage capacity without
|
||||
interrupting the overall Block Storage service. Add nodes to the pool by
|
||||
installing and configuring the appropriate hardware and software and
|
||||
then allowing that node to report in to the proper storage pool through the
|
||||
message bus. Block Storage nodes generally report into the scheduler
|
||||
service advertising their availability. As a result, after the node is
|
||||
online and available, tenants can make use of those storage resources
|
||||
instantly.
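As a rough sketch, a newly added ``cinder-volume`` node joins an existing pool
by pointing at the same message bus and advertising the same back end name as
the other nodes in that pool. The host name, credentials, and back end name
below are placeholders.

.. code-block:: ini

   [DEFAULT]
   # Sketch: the new node reports in over the shared message bus.
   transport_url = rabbit://openstack:RABBIT_PASS@controller
   enabled_backends = lvm

   [lvm]
   # Same back end name as the existing nodes in this storage pool.
   volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
   volume_group = cinder-volumes
   volume_backend_name = LVM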
|
||||
|
||||
In some cases, the demand on Block Storage may exhaust the available
|
||||
network bandwidth. As a result, design network infrastructure that
|
||||
services Block Storage resources in such a way that you can add capacity
|
||||
and bandwidth easily. This often involves the use of dynamic routing
|
||||
protocols or advanced networking solutions to add capacity to downstream
|
||||
devices easily. Both the front-end and back-end storage network designs
|
||||
should encompass the ability to quickly and easily add capacity and
|
||||
bandwidth.
|
||||
|
||||
.. note::
|
||||
|
||||
Sufficient monitoring and data collection should be in-place
|
||||
from the start, such that timely decisions regarding capacity,
|
||||
input/output metrics (IOPS) or storage-associated bandwidth can
|
||||
be made.
|
||||
|
||||
Object Storage
|
||||
~~~~~~~~~~~~~~
|
||||
Implementing Object Storage
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
While consistency and partition tolerance are both inherent features of
|
||||
the Object Storage service, it is important to design the overall
|
||||
storage architecture to ensure that the implemented system meets those
|
||||
goals. The OpenStack Object Storage service places a specific number of
|
||||
storage architecture to ensure that the implemented system meets those goals.
|
||||
The OpenStack Object Storage service places a specific number of
|
||||
data replicas as objects on resource nodes. Replicas are distributed
|
||||
throughout the cluster, based on a consistent hash ring also stored on
|
||||
each node in the cluster.
|
||||
|
||||
Design the Object Storage system with a sufficient number of zones to
|
||||
provide quorum for the number of replicas defined. For example, with
|
||||
three replicas configured in the swift cluster, the recommended number
|
||||
of zones to configure within the Object Storage cluster in order to
|
||||
When designing your cluster, you must consider durability and
availability, which depend on the spread and placement of your data
rather than on the reliability of the hardware.
|
||||
|
||||
Consider the default value of the number of replicas, which is three. This
|
||||
means that before an object is marked as having been written, at least two
copies exist; in case a single server fails to write, the third copy may or
may not yet exist when the write operation initially returns. Altering this
|
||||
number increases the robustness of your data, but reduces the amount of
|
||||
storage you have available. Look at the placement of your servers. Consider
|
||||
spreading them widely throughout your data center's network and power-failure
|
||||
zones. Is a zone a rack, a server, or a disk?
|
||||
|
||||
Consider these main traffic flows for an Object Storage network:
|
||||
|
||||
* Among :term:`object`, :term:`container`, and
|
||||
:term:`account servers <account server>`
|
||||
* Between servers and the proxies
|
||||
* Between the proxies and your users
|
||||
|
||||
Object Storage frequently communicates among servers hosting data. Even a
small cluster generates megabytes per second of traffic.
|
||||
|
||||
Consider the scenario where an entire server fails and 24 TB of data
|
||||
needs to be transferred "immediately" to remain at three copies — this can
|
||||
put significant load on the network.
|
||||
|
||||
Another consideration is when a new file is being uploaded, the proxy server
|
||||
must write out as many streams as there are replicas, multiplying network
|
||||
traffic. For a three-replica cluster, 10 Gbps in means 30 Gbps out. Combining
this with the previous high bandwidth demands of replication results in the
recommendation that your private network have significantly higher bandwidth
than your public network requires. OpenStack Object Storage communicates
internally with unencrypted, unauthenticated rsync for performance, so the
private network is required.
|
||||
|
||||
The remaining point on bandwidth is the public-facing portion. The
|
||||
``swift-proxy`` service is stateless, which means that you can easily
|
||||
add more and use HTTP load-balancing methods to share bandwidth and
|
||||
availability between them. More proxies means more bandwidth.
|
||||
|
||||
You should consider designing the Object Storage system with a sufficient
|
||||
number of zones to provide quorum for the number of replicas defined. For
|
||||
example, with three replicas configured in the swift cluster, the recommended
|
||||
number of zones to configure within the Object Storage cluster in order to
|
||||
achieve quorum is five. While it is possible to deploy a solution with
|
||||
fewer zones, the implied risk of doing so is that some data may not be
|
||||
available and API requests to certain objects stored in the cluster
|
||||
@ -271,6 +290,45 @@ have different requirements with regards to replicas, retention, and
|
||||
other factors that could heavily affect the design of storage in a
|
||||
specific zone.
|
||||
|
||||
Planning and scaling storage capacity
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
An important consideration in running a cloud over time is projecting growth
|
||||
and utilization trends in order to plan capital expenditures for the short and
|
||||
long term. Gather utilization meters for compute, network, and storage, along
|
||||
with historical records of these meters. While securing major anchor tenants
|
||||
can lead to rapid jumps in the utilization of resources, the average rate of
|
||||
adoption of cloud services through normal usage also needs to be carefully
|
||||
monitored.
|
||||
|
||||
Scaling Block Storage
|
||||
---------------------
|
||||
|
||||
You can upgrade Block Storage pools to add storage capacity without
|
||||
interrupting the overall Block Storage service. Add nodes to the pool by
|
||||
installing and configuring the appropriate hardware and software and
|
||||
then allowing that node to report in to the proper storage pool through the
|
||||
message bus. Block Storage nodes generally report into the scheduler
|
||||
service advertising their availability. As a result, after the node is
|
||||
online and available, tenants can make use of those storage resources
|
||||
instantly.
|
||||
|
||||
In some cases, the demand on Block Storage may exhaust the available
|
||||
network bandwidth. As a result, design network infrastructure that
|
||||
services Block Storage resources in such a way that you can add capacity
|
||||
and bandwidth easily. This often involves the use of dynamic routing
|
||||
protocols or advanced networking solutions to add capacity to downstream
|
||||
devices easily. Both the front-end and back-end storage network designs
|
||||
should encompass the ability to quickly and easily add capacity and
|
||||
bandwidth.
|
||||
|
||||
.. note::
|
||||
|
||||
Sufficient monitoring and data collection should be in-place
|
||||
from the start, such that timely decisions regarding capacity,
|
||||
input/output metrics (IOPS) or storage-associated bandwidth can
|
||||
be made.
|
||||
|
||||
Scaling Object Storage
|
||||
----------------------
|
||||
|
||||
@ -312,3 +370,13 @@ resources servicing requests between proxy servers and storage nodes.
|
||||
For this reason, the network architecture used for access to storage
|
||||
nodes and proxy servers should make use of a design which is scalable.
|
||||
|
||||
|
||||
Redundancy
|
||||
----------
|
||||
|
||||
.. TODO
|
||||
|
||||
Replication
|
||||
-----------
|
||||
|
||||
.. TODO
|
@ -1,26 +0,0 @@
|
||||
=============
|
||||
Block Storage
|
||||
=============
|
||||
|
||||
Block storage also known as volume storage in OpenStack provides users
|
||||
with access to block storage devices. Users interact with block storage
|
||||
by attaching volumes to their running VM instances.
|
||||
|
||||
These volumes are persistent, they can be detached from one instance and
|
||||
re-attached to another and the data remains intact. Block storage is
|
||||
implemented in OpenStack by the OpenStack Block Storage (cinder), which
|
||||
supports multiple back ends in the form of drivers. Your
|
||||
choice of a storage back end must be supported by a Block Storage
|
||||
driver.
|
||||
|
||||
Most block storage drivers allow the instance to have direct access to
|
||||
the underlying storage hardware's block device. This helps increase the
|
||||
overall read/write IO. However, support for utilizing files as volumes
|
||||
is also well established, with full support for NFS, GlusterFS and
|
||||
others.
|
||||
|
||||
These drivers work a little differently than a traditional block
|
||||
storage driver. On an NFS or GlusterFS file system, a single file is
|
||||
created and then mapped as a virtual volume into the instance. This
|
||||
mapping/translation is similar to how OpenStack utilizes QEMU's
|
||||
file-based virtual machines stored in ``/var/lib/nova/instances``.
|
@ -1,134 +0,0 @@
|
||||
==============================
|
||||
Commodity storage technologies
|
||||
==============================
|
||||
|
||||
This section provides a high-level overview of the differences among the
|
||||
different commodity storage back end technologies. Depending on your
|
||||
cloud user's needs, you can implement one or many of these technologies
|
||||
in different combinations:
|
||||
|
||||
OpenStack Object Storage (swift)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Swift is the official OpenStack Object Store implementation. It is
|
||||
a mature technology that has been used for several years in production
|
||||
by a number of large cloud service providers. It is highly scalable
|
||||
and well-suited to managing petabytes of storage.
|
||||
|
||||
Swift's advantages include better integration with OpenStack (integrates
|
||||
with OpenStack Identity, works with the OpenStack dashboard interface)
|
||||
and better support for distributed deployments across multiple datacenters
|
||||
through support for asynchronous eventual consistency replication.
|
||||
|
||||
Therefore, if you eventually plan on distributing your storage
|
||||
cluster across multiple data centers, if you need unified accounts
|
||||
for your users for both compute and object storage, or if you want
|
||||
to control your object storage with the OpenStack dashboard, you
|
||||
should consider OpenStack Object Storage. More detail can be found
|
||||
about OpenStack Object Storage in the section below.
|
||||
|
||||
Further information can be found on the `Swift Project page
|
||||
<https://www.openstack.org/software/releases/ocata/components/swift>`_.
|
||||
|
||||
Ceph
|
||||
~~~~
|
||||
|
||||
A scalable storage solution that replicates data across commodity
|
||||
storage nodes.
|
||||
|
||||
Ceph utilises an object storage mechanism for data storage and exposes
the data via different types of storage interfaces to the end user. It
supports interfaces for:
|
||||
* Object storage
|
||||
* Block storage
|
||||
* File-system interfaces
|
||||
|
||||
Ceph provides support for the same Object Storage API as Swift and can
|
||||
be used as a back end for cinder block storage as well as back-end storage
|
||||
for glance images.
|
||||
|
||||
Ceph supports thin provisioning, implemented using copy-on-write. This can
|
||||
be useful when booting from volume because a new volume can be provisioned
|
||||
very quickly. Ceph also supports keystone-based authentication (as of
|
||||
version 0.56), so it can be a seamless swap in for the default OpenStack
|
||||
Swift implementation.
|
||||
|
||||
Ceph's advantages include:
|
||||
* provides the administrator more fine-grained
|
||||
control over data distribution and replication strategies
|
||||
* enables consolidation of object and block storage
|
||||
* enables very fast provisioning of boot-from-volume
|
||||
instances using thin provisioning
|
||||
* supports a distributed file-system interface, `CephFS <http://ceph.com/docs/master/cephfs/>`_
|
||||
|
||||
If you want to manage your object and block storage within a single
|
||||
system, or if you want to support fast boot-from-volume, you should
|
||||
consider Ceph.
|
||||
|
||||
Gluster
|
||||
~~~~~~~
|
||||
|
||||
A distributed shared file system. As of Gluster version 3.3, you
|
||||
can use Gluster to consolidate your object storage and file storage
|
||||
into one unified file and object storage solution, which is called
|
||||
Gluster For OpenStack (GFO). GFO uses a customized version of swift
|
||||
that enables Gluster to be used as the back-end storage.
|
||||
|
||||
The main reason to use GFO rather than swift is if you also
|
||||
want to support a distributed file system, either to support shared
|
||||
storage live migration or to provide it as a separate service to
|
||||
your end users. If you want to manage your object and file storage
|
||||
within a single system, you should consider GFO.
|
||||
|
||||
LVM
|
||||
~~~
|
||||
|
||||
The Logical Volume Manager (LVM) is a Linux-based system that provides an
|
||||
abstraction layer on top of physical disks to expose logical volumes
|
||||
to the operating system. The LVM back-end implements block storage
|
||||
as LVM logical partitions.
|
||||
|
||||
On each host that will house block storage, an administrator must
|
||||
initially create a volume group dedicated to Block Storage volumes.
|
||||
Blocks are created from LVM logical volumes.
|
||||
|
||||
.. note::
|
||||
|
||||
LVM does *not* provide any replication. Typically,
|
||||
administrators configure RAID on nodes that use LVM as block
|
||||
storage to protect against failures of individual hard drives.
|
||||
However, RAID does not protect against a failure of the entire
|
||||
host.
|
||||
|
||||
ZFS
|
||||
~~~
|
||||
|
||||
The Solaris iSCSI driver for OpenStack Block Storage implements
|
||||
blocks as ZFS entities. ZFS is a file system that also has the
|
||||
functionality of a volume manager. This is unlike on a Linux system,
|
||||
where there is a separation of volume manager (LVM) and file system
|
||||
(such as, ext3, ext4, xfs, and btrfs). ZFS has a number of
|
||||
advantages over ext4, including improved data-integrity checking.
|
||||
|
||||
The ZFS back end for OpenStack Block Storage supports only
|
||||
Solaris-based systems, such as Illumos. While there is a Linux port
|
||||
of ZFS, it is not included in any of the standard Linux
|
||||
distributions, and it has not been tested with OpenStack Block
|
||||
Storage. As with LVM, ZFS does not provide replication across hosts
on its own; you need to add a replication solution on top of ZFS if
|
||||
your cloud needs to be able to handle storage-node failures.
|
||||
|
||||
|
||||
Sheepdog
|
||||
~~~~~~~~
|
||||
|
||||
Sheepdog is a userspace distributed storage system. Sheepdog scales
|
||||
to several hundred nodes, and has powerful virtual disk management
|
||||
features like snapshot, cloning, rollback and thin provisioning.
|
||||
|
||||
It is essentially an object storage system that manages disks and
|
||||
aggregates the space and performance of disks linearly in hyper
|
||||
scale on commodity hardware in a smart way. On top of its object store,
|
||||
Sheepdog provides elastic volume service and http service.
|
||||
Sheepdog does require a specific kernel version and can work
|
||||
nicely with xattr-supported file systems.
|
@ -1,47 +1,109 @@
|
||||
==============
|
||||
Storage design
|
||||
==============
|
||||
================
|
||||
Storage concepts
|
||||
================
|
||||
|
||||
Storage is found in many parts of the OpenStack cloud environment. This
|
||||
section describes persistent storage options you can configure with
|
||||
your cloud. It is important to understand the distinction between
|
||||
Storage is found in many parts of the OpenStack cloud environment. It is
|
||||
important to understand the distinction between
|
||||
:term:`ephemeral <ephemeral volume>` storage and
|
||||
:term:`persistent <persistent volume>` storage.
|
||||
:term:`persistent <persistent volume>` storage:
|
||||
|
||||
Ephemeral storage
|
||||
~~~~~~~~~~~~~~~~~
|
||||
- Ephemeral storage - If you only deploy OpenStack
|
||||
:term:`Compute service (nova)`, by default your users do not have access to
|
||||
any form of persistent storage. The disks associated with VMs are ephemeral,
|
||||
meaning that from the user's point of view they disappear when a virtual
|
||||
machine is terminated.
|
||||
|
||||
If you deploy only the OpenStack :term:`Compute service (nova)`, by
|
||||
default your users do not have access to any form of persistent storage. The
|
||||
disks associated with VMs are ephemeral, meaning that from the user's point
|
||||
of view they disappear when a virtual machine is terminated.
|
||||
- Persistent storage - Persistent storage means that the storage resource
|
||||
outlives any other resource and is always available, regardless of the state
|
||||
of a running instance.
|
||||
|
||||
Persistent storage
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Persistent storage means that the storage resource outlives any other
|
||||
resource and is always available, regardless of the state of a running
|
||||
instance.
|
||||
|
||||
Today, OpenStack clouds explicitly support three types of persistent
|
||||
storage: *Object Storage*, *Block Storage*, and *File-Based Storage*.
|
||||
OpenStack clouds explicitly support three types of persistent
|
||||
storage: *Object Storage*, *Block Storage*, and *File-based storage*.
|
||||
|
||||
Object storage
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
Object storage is implemented in OpenStack by the
|
||||
OpenStack Object Storage (swift) project. Users access binary objects
|
||||
through a REST API. If your intended users need to
|
||||
archive or manage large datasets, you want to provide them with Object
|
||||
Storage. In addition, OpenStack can store your virtual machine (VM)
|
||||
images inside of an object storage system, as an alternative to storing
|
||||
the images on a file system.
|
||||
Object Storage service (swift). Users access binary objects through a REST API.
|
||||
If your intended users need to archive or manage large datasets, you should
|
||||
provide them with the Object Storage service. Additional benefits include:
|
||||
|
||||
OpenStack storage concepts
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
- OpenStack can store your virtual machine (VM) images inside of an Object
|
||||
Storage system, as an alternative to storing the images on a file system.
|
||||
- Integration with OpenStack Identity and with the OpenStack Dashboard.
|
||||
- Better support for distributed deployments across multiple datacenters
|
||||
through support for asynchronous eventual consistency replication.
|
||||
|
||||
:ref:`table_openstack_storage` explains the different storage concepts
|
||||
provided by OpenStack.
|
||||
You should consider using the OpenStack Object Storage service if you eventually
|
||||
plan on distributing your storage cluster across multiple data centers, if you
|
||||
need unified accounts for your users for both compute and object storage, or if
|
||||
you want to control your object storage with the OpenStack Dashboard. For more
|
||||
information, see the `Swift project page <https://www.openstack.org/software/releases/ocata/components/swift>`_.
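Storing Image service (glance) images in Object Storage instead of on a local
file system is a back end configuration choice. The following
``glance-api.conf`` excerpt is a hedged sketch; the container name is
illustrative.

.. code-block:: ini

   [glance_store]
   # Sketch: keep images in Object Storage rather than on local disk.
   stores = swift
   default_store = swift
   swift_store_container = glance
   swift_store_create_container_on_put = True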
|
||||
|
||||
Block storage
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
The Block Storage service (cinder) in OpenStack provides users with access to
block storage devices in the form of persistent volumes. Because these volumes
are persistent, they can be detached from one instance and re-attached to
another instance and the data remains intact.
|
||||
|
||||
The Block Storage service supports multiple back ends in the form of drivers.
|
||||
Your choice of a storage back end must be supported by a block storage
|
||||
driver.
|
||||
|
||||
Most block storage drivers allow the instance to have direct access to
|
||||
the underlying storage hardware's block device. This helps increase the
|
||||
overall read/write IO. However, support for utilizing files as volumes
|
||||
is also well established, with full support for NFS, GlusterFS and
|
||||
others.
|
||||
|
||||
These drivers work a little differently than a traditional block
|
||||
storage driver. On an NFS or GlusterFS file system, a single file is
|
||||
created and then mapped as a virtual volume into the instance. This
|
||||
mapping and translation is similar to how OpenStack utilizes QEMU's
|
||||
file-based virtual machines stored in ``/var/lib/nova/instances``.
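As a hedged example of the file-as-volume approach, the NFS driver for the
Block Storage service is configured along the following lines. The back end
section name is illustrative, and the shares file path is the conventional
location rather than a requirement.

.. code-block:: ini

   [nfs-backend]
   # Sketch: each volume is a file on an NFS export, mapped into the
   # instance as a virtual block device. The back end must also be listed
   # in enabled_backends.
   volume_driver = cinder.volume.drivers.nfs.NfsDriver
   nfs_shares_config = /etc/cinder/nfs_shares
   nfs_mount_point_base = $state_path/mnt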
|
||||
|
||||
File-based storage
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In a multi-tenant OpenStack cloud environment, the Shared File Systems service
|
||||
(manila) provides a set of services for management of shared file systems. The
|
||||
Shared File Systems service supports multiple back-ends in the form of drivers,
|
||||
and can be configured to provision shares from one or more back-ends. Share
|
||||
servers are virtual machines that export file shares using different file
|
||||
system protocols such as NFS, CIFS, GlusterFS, or HDFS.
|
||||
|
||||
The Shared File Systems service is persistent storage and can be mounted to any
|
||||
number of client machines. It can also be detached from one instance and
|
||||
attached to another instance without data loss. During this process the data
|
||||
are safe unless the Shared File Systems service itself is changed or removed.
|
||||
|
||||
Users interact with the Shared File Systems service by mounting remote file
systems on their instances and then using those systems for file storage and
exchange. The Shared File Systems service provides a share, which is a remote,
mountable file system. You can mount a share and access it from several hosts
by several users at a time. With shares, you can also perform the following
operations (a back end configuration sketch follows this list):
|
||||
|
||||
* Create a share specifying its size, shared file system protocol,
|
||||
visibility level.
|
||||
* Create a share on either a share server or standalone, depending on
|
||||
the selected back-end mode, with or without using a share network.
|
||||
* Specify access rules and security services for existing shares.
|
||||
* Combine several shares in groups to keep data consistency inside the
groups for safe group operations.
|
||||
* Create a snapshot of a selected share or a share group for storing
|
||||
the existing shares consistently or creating new shares from that
|
||||
snapshot in a consistent way.
|
||||
* Create a share from a snapshot.
|
||||
* Set rate limits and quotas for specific shares and snapshots.
|
||||
* View usage of share resources.
|
||||
* Remove shares.
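The details vary widely by driver, but as a minimal sketch, a Shared File
Systems back end is declared in ``manila.conf`` along the following lines. The
back end name and the choice of the generic driver are illustrative
assumptions, not recommendations.

.. code-block:: ini

   [DEFAULT]
   enabled_share_backends = generic-backend

   [generic-backend]
   # Sketch: a driver that exports shares over NFS from share servers.
   share_backend_name = GENERIC
   share_driver = manila.share.drivers.generic.GenericShareDriver
   driver_handles_share_servers = True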
|
||||
|
||||
Differences between storage types
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
:ref:`table_openstack_storage` explains the differences between OpenStack
|
||||
storage types.
|
||||
|
||||
.. _table_openstack_storage:
|
||||
|
||||
@ -54,7 +116,7 @@ provided by OpenStack.
|
||||
- Block storage
|
||||
- Object storage
|
||||
- Shared File System storage
|
||||
* - Used to…
|
||||
* - Application
|
||||
- Run operating system and scratch space
|
||||
- Add additional persistent storage to a virtual machine (VM)
|
||||
- Store data, including VM images
|
||||
@ -90,8 +152,8 @@ provided by OpenStack.
|
||||
* Requests for extension
|
||||
* Available user-level quotas
|
||||
* Limitations applied by Administrator
|
||||
* - Encryption set by…
|
||||
- Parameter in nova.conf
|
||||
* - Encryption configuration
|
||||
- Parameter in ``nova.conf``
|
||||
- Admin establishing `encrypted volume type
|
||||
<https://docs.openstack.org/admin-guide/dashboard-manage-volumes.html>`_,
|
||||
then user selecting encrypted volume
|
||||
@ -111,13 +173,13 @@ provided by OpenStack.
|
||||
|
||||
.. note::
|
||||
|
||||
**File-level Storage (for Live Migration)**
|
||||
**File-level storage for live migration**
|
||||
|
||||
With file-level storage, users access stored data using the operating
|
||||
system's file system interface. Most users, if they have used a network
|
||||
storage solution before, have encountered this form of networked
|
||||
storage. In the Unix world, the most common form of this is NFS. In the
|
||||
Windows world, the most common form is called CIFS (previously, SMB).
|
||||
system's file system interface. Most users who have used a network
|
||||
storage solution before have encountered this form of networked
|
||||
storage. The most common file system protocol for Unix is NFS, and for
|
||||
Windows, CIFS (previously, SMB).
|
||||
|
||||
OpenStack clouds do not present file-level storage to end users.
|
||||
However, it is important to consider file-level storage for storing
|
||||
@ -125,267 +187,121 @@ provided by OpenStack.
|
||||
since you must have a shared file system if you want to support live
|
||||
migration.
|
||||
|
||||
Choosing storage back ends
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Commodity storage technologies
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Users will indicate different needs for their cloud use cases. Some may
|
||||
need fast access to many objects that do not change often, or want to
|
||||
set a time-to-live (TTL) value on a file. Others may access only storage
|
||||
that is mounted with the file system itself, but want it to be
|
||||
replicated instantly when starting a new instance. For other systems,
|
||||
ephemeral storage is the preferred choice. When you select
|
||||
:term:`storage back ends <storage back end>`,
|
||||
consider the following questions from user's perspective:
|
||||
There are various commodity storage back end technologies available. Depending
|
||||
on your cloud user's needs, you can implement one or many of these technologies
|
||||
in different combinations.
|
||||
|
||||
* Do my users need block storage?
|
||||
* Do my users need object storage?
|
||||
* Do I need to support live migration?
|
||||
* Should my persistent storage drives be contained in my compute nodes,
|
||||
or should I use external storage?
|
||||
* What is the platter count I can achieve? Do more spindles result in
|
||||
better I/O despite network access?
|
||||
* Which one results in the best cost-performance scenario I'm aiming for?
|
||||
* How do I manage the storage operationally?
|
||||
* How redundant and distributed is the storage? What happens if a
|
||||
storage node fails? To what extent can it mitigate my data-loss
|
||||
disaster scenarios?
|
||||
Ceph
|
||||
----
|
||||
|
||||
To deploy your storage by using only commodity hardware, you can use a number
|
||||
of open-source packages, as shown in :ref:`table_persistent_file_storage`.
|
||||
Ceph is a scalable storage solution that replicates data across commodity
|
||||
storage nodes.
|
||||
|
||||
.. _table_persistent_file_storage:
|
||||
Ceph utilises an object storage mechanism for data storage and exposes
the data via different types of storage interfaces to the end user. It
supports interfaces for:
|
||||
- Object storage
|
||||
- Block storage
|
||||
- File-system interfaces
|
||||
|
||||
.. list-table:: Table. Persistent file-based storage support
|
||||
:widths: 25 25 25 25
|
||||
:header-rows: 1
|
||||
Ceph provides support for the same Object Storage API as swift and can
|
||||
be used as a back end for the Block Storage service (cinder) as well as
|
||||
back-end storage for glance images.
|
||||
|
||||
* -
|
||||
- Object
|
||||
- Block
|
||||
- File-level
|
||||
* - Swift
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
-
|
||||
* - LVM
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
* - Ceph
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- Experimental
|
||||
* - Gluster
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
* - NFS
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
* - ZFS
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
* - Sheepdog
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
Ceph supports thin provisioning implemented using copy-on-write. This can
|
||||
be useful when booting from volume because a new volume can be provisioned
|
||||
very quickly. Ceph also supports keystone-based authentication (as of
|
||||
version 0.56), so it can be a seamless swap in for the default OpenStack
|
||||
swift implementation.
|
||||
|
||||
This list of open source file-level shared storage solutions is not
|
||||
exhaustive other open source solutions exist (MooseFS). Your
|
||||
organization may already have deployed a file-level shared storage
|
||||
solution that you can use.
|
||||
Ceph's advantages include:
|
||||
|
||||
- The administrator has more fine-grained control over data distribution and
|
||||
replication strategies.
|
||||
- Consolidation of object storage and block storage.
|
||||
- Fast provisioning of boot-from-volume instances using thin provisioning.
|
||||
- Support for the distributed file-system interface
|
||||
`CephFS <http://ceph.com/docs/master/cephfs/>`_.
|
||||
|
||||
You should consider Ceph if you want to manage your object and block storage
|
||||
within a single system, or if you want to support fast boot-from-volume.
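As a hedged companion to consolidating block and object storage on Ceph, the
Image service can keep its images in the same cluster. The pool and user names
below follow common convention but are assumptions.

.. code-block:: ini

   [glance_store]
   # Sketch: store glance images as RBD objects in a Ceph pool. Combined
   # with an RBD-backed Block Storage service, this can enable
   # copy-on-write cloning for fast boot-from-volume.
   stores = rbd
   default_store = rbd
   rbd_store_pool = images
   rbd_store_user = glance
   rbd_store_ceph_conf = /etc/ceph/ceph.conf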
|
||||
|
||||
LVM
|
||||
---
|
||||
|
||||
The Logical Volume Manager (LVM) is a Linux-based system that provides an
|
||||
abstraction layer on top of physical disks to expose logical volumes
|
||||
to the operating system. The LVM back-end implements block storage
|
||||
as LVM logical partitions.
|
||||
|
||||
On each host that will house block storage, an administrator must
|
||||
initially create a volume group dedicated to Block Storage volumes.
|
||||
Blocks are created from LVM logical volumes.
|
||||
|
||||
.. note::
|
||||
|
||||
**Storage Driver Support**
|
||||
LVM does *not* provide any replication. Typically,
|
||||
administrators configure RAID on nodes that use LVM as block
|
||||
storage to protect against failures of individual hard drives.
|
||||
However, RAID does not protect against a failure of the entire
|
||||
host.
|
||||
|
||||
In addition to the open source technologies, there are a number of
|
||||
proprietary solutions that are officially supported by OpenStack Block
|
||||
Storage. You can find a matrix of the functionality provided by all of the
|
||||
supported Block Storage drivers on the `OpenStack
|
||||
wiki <https://wiki.openstack.org/wiki/CinderSupportMatrix>`_.
|
||||
ZFS
|
||||
---
|
||||
|
||||
Also, you need to decide whether you want to support object storage in
|
||||
your cloud. The two common use cases for providing object storage in a
|
||||
compute cloud are:
|
||||
The Solaris iSCSI driver for OpenStack Block Storage implements
|
||||
blocks as ZFS entities. ZFS is a file system that also has the
|
||||
functionality of a volume manager. This is unlike on a Linux system,
|
||||
where there is a separation of volume manager (LVM) and file system
|
||||
(such as, ext3, ext4, xfs, and btrfs). ZFS has a number of
|
||||
advantages over ext4, including improved data-integrity checking.
|
||||
|
||||
* To provide users with a persistent storage mechanism
|
||||
* As a scalable, reliable data store for virtual machine images
|
||||
The ZFS back end for OpenStack Block Storage supports only
|
||||
Solaris-based systems, such as Illumos. While there is a Linux port
|
||||
of ZFS, it is not included in any of the standard Linux
|
||||
distributions, and it has not been tested with OpenStack Block
|
||||
Storage. As with LVM, ZFS does not provide replication across hosts
on its own; you need to add a replication solution on top of ZFS if
|
||||
your cloud needs to be able to handle storage-node failures.
|
||||
|
||||
Selecting storage hardware
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Gluster
|
||||
-------
|
||||
|
||||
Storage hardware architecture is determined by the storage architecture that
you select. Choose a storage architecture by evaluating possible solutions
against critical factors: user requirements, technical considerations, and
operational considerations. Consider the following factors when selecting
storage hardware:
|
||||
A distributed shared file system. As of Gluster version 3.3, you
|
||||
can use Gluster to consolidate your object storage and file storage
|
||||
into one unified file and object storage solution, which is called
|
||||
Gluster For OpenStack (GFO). GFO uses a customized version of swift
|
||||
that enables Gluster to be used as the back-end storage.
|
||||
|
||||
Cost
|
||||
Storage can be a significant portion of the overall system cost. For
|
||||
an organization that is concerned with vendor support, a commercial
|
||||
storage solution is advisable, although it comes with a higher price
|
||||
tag. If minimizing initial capital expenditure is a priority, you can design
a system based on commodity hardware. The trade-off is
|
||||
potentially higher support costs and a greater risk of
|
||||
incompatibility and interoperability issues.
|
||||
The main reason to use GFO rather than swift is if you also
|
||||
want to support a distributed file system, either to support shared
|
||||
storage live migration or to provide it as a separate service to
|
||||
your end users. If you want to manage your object and file storage
|
||||
within a single system, you should consider GFO.
|
||||
|
||||
Performance
|
||||
The latency of storage I/O requests indicates performance. Performance
|
||||
requirements affect which solution you choose.
|
||||
Sheepdog
|
||||
--------
|
||||
|
||||
Scalability
|
||||
Scalability, along with expandability, is a major consideration in a
|
||||
general purpose OpenStack cloud. It might be difficult to predict
|
||||
the final intended size of the implementation as there are no
|
||||
established usage patterns for a general purpose cloud. It might
|
||||
become necessary to expand the initial deployment in order to
|
||||
accommodate growth and user demand.
|
||||
Sheepdog is a userspace distributed storage system. Sheepdog scales
|
||||
to several hundred nodes, and has powerful virtual disk management
|
||||
features like snapshot, cloning, rollback and thin provisioning.
|
||||
|
||||
Expandability
|
||||
Expandability is a major architecture factor for storage solutions
|
||||
with general purpose OpenStack cloud. A storage solution that
|
||||
expands to 50 PB is considered more expandable than a solution that
|
||||
only scales to 10 PB. This meter is related to scalability, which is
|
||||
the measure of a solution's performance as it expands.
|
||||
It is essentially an object storage system that manages disks and
|
||||
aggregates the space and performance of disks linearly in hyper
|
||||
scale on commodity hardware in a smart way. On top of its object store,
|
||||
Sheepdog provides elastic volume service and http service.
|
||||
Sheepdog does require a specific kernel version and can work
|
||||
nicely with xattr-supported file systems.
|
||||
|
||||
General purpose cloud storage requirements
|
||||
------------------------------------------
|
||||
Using a scale-out storage solution with direct-attached storage (DAS) in
|
||||
the servers is well suited for a general purpose OpenStack cloud. Cloud
|
||||
services requirements determine your choice of scale-out solution. You
|
||||
need to determine if a single, highly expandable and highly vertically
scalable, centralized storage array is suitable for your design. After
determining an approach, select the storage hardware based on these
criteria.
|
||||
NFS
|
||||
---
|
||||
|
||||
This list expands upon the potential impacts of including a particular
storage architecture (and corresponding storage hardware) in the
|
||||
design for a general purpose OpenStack cloud:
|
||||
.. TODO
|
||||
|
||||
Connectivity
|
||||
If storage protocols other than Ethernet are part of the storage solution,
|
||||
ensure the appropriate hardware has been selected. If a centralized storage
|
||||
array is selected, ensure that the hypervisor will be able to connect to
|
||||
that storage array for image storage.
|
||||
iSCSI
|
||||
-----
|
||||
|
||||
Usage
|
||||
How the particular storage architecture will be used is critical for
|
||||
determining the architecture. Some of the configurations that will
|
||||
influence the architecture include whether it will be used by the
|
||||
hypervisors for ephemeral instance storage, or if OpenStack Object
|
||||
Storage will use it for object storage.
|
||||
|
||||
Instance and image locations
|
||||
Where instances and images will be stored will influence the
|
||||
architecture.
|
||||
|
||||
Server hardware
|
||||
If the solution is a scale-out storage architecture that includes
|
||||
DAS, it will affect the server hardware selection. This could ripple
|
||||
into the decisions that affect host density, instance density, power
|
||||
density, OS-hypervisor, management tools and others.
|
||||
|
||||
A general purpose OpenStack cloud has multiple options. The key factors
|
||||
that will have an influence on selection of storage hardware for a
|
||||
general purpose OpenStack cloud are as follows:
|
||||
|
||||
Capacity
|
||||
Hardware resources selected for the resource nodes should be capable
|
||||
of supporting enough storage for the cloud services. Defining the
|
||||
initial requirements and ensuring the design can support adding
|
||||
capacity is important. Hardware nodes selected for object storage
|
||||
should be capable of supporting a large number of inexpensive disks
|
||||
with no reliance on RAID controller cards. Hardware nodes selected
|
||||
for block storage should be capable of supporting high speed storage
|
||||
solutions and RAID controller cards to provide performance and
|
||||
redundancy to storage at a hardware level. Selecting hardware RAID
|
||||
controllers that automatically repair damaged arrays will assist
|
||||
with the replacement and repair of degraded or deleted storage
|
||||
devices.
|
||||
|
||||
Performance
|
||||
Disks selected for object storage services do not need to be fast
|
||||
performing disks. We recommend that object storage nodes take
|
||||
advantage of the best cost per terabyte available for storage.
|
||||
Contrastingly, disks chosen for block storage services should take
|
||||
advantage of performance boosting features that may entail the use
|
||||
of SSDs or flash storage to provide high performance block storage
|
||||
pools. Storage performance of ephemeral disks used for instances
|
||||
should also be taken into consideration.
|
||||
|
||||
Fault tolerance
|
||||
Object storage resource nodes have no requirements for hardware
|
||||
fault tolerance or RAID controllers. It is not necessary to plan for
|
||||
fault tolerance within the object storage hardware because the
|
||||
object storage service provides replication between zones as a
|
||||
feature of the service. Block storage nodes, compute nodes, and
|
||||
cloud controllers should all have fault tolerance built in at the
|
||||
hardware level by making use of hardware RAID controllers and
|
||||
varying levels of RAID configuration. The level of RAID chosen
|
||||
should be consistent with the performance and availability
|
||||
requirements of the cloud.
|
||||
|
||||
Storage-focused cloud storage requirements
------------------------------------------

Storage-focused OpenStack clouds must address I/O intensive workloads.
These workloads are not CPU intensive, nor are they consistently network
intensive. The network may be heavily utilized to transfer storage data,
but they are not otherwise network intensive.

The selection of storage hardware determines the overall performance and
scalability of a storage-focused OpenStack design architecture. Several
factors impact the design process, including:

Latency is a key consideration in a storage-focused OpenStack cloud.
Using solid-state disks (SSDs) to minimize latency and to reduce CPU
delays caused by waiting for the storage increases performance. Use
RAID controller cards in compute hosts to improve the performance of the
underlying disk subsystem.

Depending on the storage architecture, you can adopt a scale-out
solution, or use a highly expandable and scalable centralized storage
array. If a centralized storage array meets your requirements, then the
array vendor determines the hardware selection. It is possible to build
a storage array using commodity hardware with Open Source software, but
doing so requires people with expertise to build such a system.

On the other hand, a scale-out storage solution that uses
direct-attached storage (DAS) in the servers may be an appropriate
choice. This requires configuration of the server hardware to support
the storage solution.

Considerations affecting storage architecture (and corresponding storage
hardware) of a storage-focused OpenStack cloud include:

Connectivity
  Ensure the connectivity matches the storage solution requirements. We
  recommend confirming that the network characteristics minimize latency
  to boost the overall performance of the design.

Latency
  Determine if the use case has consistent or highly variable latency.

Throughput
  Ensure that the storage solution throughput is optimized for your
  application requirements.

Server hardware
  Use of DAS impacts the server hardware choice and affects host
  density, instance density, power density, OS-hypervisor, and
  management tools.

.. TODO

@ -1,44 +0,0 @@
===========================
Shared File Systems service
===========================

The Shared File Systems service (manila) provides a set of services for
management of shared file systems in a multi-tenant cloud environment.
Users interact with the Shared File Systems service by mounting remote
file systems on their instances and then using those systems for file
storage and exchange. The Shared File Systems service provides shares,
which are remote, mountable file systems. A share can be mounted to and
accessed from several hosts by several users at a time. With shares,
users can also:

* Create a share, specifying its size, shared file system protocol, and
  visibility level.
* Create a share on either a share server or standalone, depending on
  the selected back-end mode, with or without using a share network.
* Specify access rules and security services for existing shares.
* Combine several shares in groups to keep data consistency inside the
  groups for subsequent safe group operations.
* Create a snapshot of a selected share or a share group to store the
  existing shares consistently, or to create new shares from that
  snapshot in a consistent way.
* Create a share from a snapshot.
* Set rate limits and quotas for specific shares and snapshots.
* View usage of share resources.
* Remove shares.

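As a hedged illustration of this workflow, the console sketch below
creates, protects, and snapshots a share. The share name, size, and
client subnet are hypothetical, and the exact syntax should be checked
against the python-manilaclient release in use:

.. code-block:: console

   # Create a 1 GB NFS share (names and sizes are illustrative only).
   $ manila create NFS 1 --name my-share

   # Allow an example client subnet to mount the share.
   $ manila access-allow my-share ip 203.0.113.0/24

   # Snapshot the share, then create a new share from the snapshot
   # (the snapshot may need to be referenced by its ID).
   $ manila snapshot-create my-share --name my-snapshot
   $ manila create NFS 1 --name my-share-copy --snapshot-id my-snapshot

   # List shares and inspect usage.
   $ manila list
   $ manila show my-share
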
Like Block Storage, the Shared File Systems service is persistent. It
can be:

* Mounted to any number of client machines.
* Detached from one instance and attached to another without data loss.
  During this process the data are safe unless the Shared File Systems
  service itself is changed or removed.

Shares are provided by the Shared File Systems service. In OpenStack,
the Shared File Systems service is implemented by the Shared File
Systems (manila) project, which supports multiple back ends in the form
of drivers. The Shared File Systems service can be configured to
provision shares from one or more back ends. Share servers are usually
virtual machines that export file shares using different protocols such
as NFS, CIFS, GlusterFS, or HDFS.

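A hedged sketch of this multi-back-end model follows. The back-end name
and the choice of the generic driver are assumptions for illustration
only, and option names should be verified against the manila release in
use:

.. code-block:: ini

   # Hypothetical manila.conf excerpt: one driver-backed share back end.
   [DEFAULT]
   enabled_share_backends = generic_nfs

   [generic_nfs]
   share_backend_name = GENERIC_NFS
   share_driver = manila.share.drivers.generic.GenericShareDriver
   driver_handles_share_servers = True
   # The generic driver exports shares from service VMs over NFS/CIFS;
   # other back ends plug in the same way, one section per back end.
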
@ -1,69 +0,0 @@
==============
Object Storage
==============

Object Storage is implemented in OpenStack by the
OpenStack Object Storage (swift) project. Users access binary objects
through a REST API. If your intended users need to
archive or manage large datasets, you want to provide them with Object
Storage. In addition, OpenStack can store your virtual machine (VM)
images inside of an object storage system, as an alternative to storing
the images on a file system.

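For example, a user-facing workflow might look like the console sketch
below. The container and object names are hypothetical, and the
``swift`` client shown here is only one of several ways to reach the
REST API:

.. code-block:: console

   # Create a container and upload a large dataset archive to it.
   $ swift post dataset-archive
   $ swift upload dataset-archive dataset.tar.gz

   # Verify the object landed and inspect its metadata.
   $ swift list dataset-archive
   $ swift stat dataset-archive dataset.tar.gz
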
OpenStack Object Storage provides a highly scalable, highly available
storage solution by relaxing some of the constraints of traditional file
systems. In designing and procuring for such a cluster, it is important
to understand some key concepts about its operation. Essentially, this
type of storage is built on the idea that all storage hardware fails, at
every level, at some point. Infrequently encountered failures that would
hamstring other storage systems, such as issues taking down RAID cards
or entire servers, are handled gracefully with OpenStack Object
Storage. For more information, see the `Swift developer
documentation <https://docs.openstack.org/developer/swift/overview_architecture.html>`_.

When designing your cluster, you must consider durability and
availability, which are dependent on the spread and placement of your
data rather than the reliability of the hardware.

Consider the default value of the number of replicas, which is three. This
means that before an object is marked as having been written, at least two
copies exist in case a single server fails to write; the third copy may or
may not yet exist when the write operation initially returns. Altering this
number increases the robustness of your data, but reduces the amount of
storage you have available. Look at the placement of your servers. Consider
spreading them widely throughout your data center's network and power-failure
zones. Is a zone a rack, a server, or a disk?

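The replica count is fixed when the storage rings are built. As a hedged
sketch of how that choice is expressed, assuming the standard
``swift-ring-builder`` tool and illustrative device addresses:

.. code-block:: console

   # Create an object ring with 2^10 partitions, 3 replicas, and a
   # 1-hour minimum between partition moves.
   $ swift-ring-builder object.builder create 10 3 1

   # Add devices spread across zones (addresses are examples only).
   $ swift-ring-builder object.builder add r1z1-192.0.2.11:6200/sdb 100
   $ swift-ring-builder object.builder add r1z2-192.0.2.12:6200/sdb 100
   $ swift-ring-builder object.builder add r1z3-192.0.2.13:6200/sdb 100

   # Distribute partitions across the devices and write the ring file.
   $ swift-ring-builder object.builder rebalance
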
Consider these main traffic flows for an Object Storage network:

* Among :term:`object`, :term:`container`, and
  :term:`account servers <account server>`
* Between servers and the proxies
* Between the proxies and your users

Object Storage frequently communicates among servers hosting data. Even a small
cluster generates megabytes per second of traffic, which is predominantly, “Do
you have the object?” and “Yes I have the object!” If the answer
to the question is negative or the request times out,
replication of the object begins.

Consider the scenario where an entire server fails and 24 TB of data
needs to be transferred "immediately" to remain at three copies — this can
put significant load on the network.

Another consideration is that when a new file is being uploaded, the proxy
server must write out as many streams as there are replicas, multiplying
network traffic. For a three-replica cluster, 10 Gbps in means 30 Gbps out.
Combining this with the previous high bandwidth demands of replication
results in the recommendation that your private network be of significantly
higher bandwidth than your public network requires. OpenStack Object Storage
communicates internally with unencrypted, unauthenticated rsync for
performance, so the private network is required.

The remaining point on bandwidth is the public-facing portion. The
``swift-proxy`` service is stateless, which means that you can easily
add more and use HTTP load-balancing methods to share bandwidth and
availability between them.

More proxies means more bandwidth, if your storage can keep up.

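As a hedged illustration of that load-balancing approach, an HAProxy
fragment such as the following could spread client traffic across
several proxy nodes. The addresses, port, and health-check path are
assumptions to adapt to your deployment:

.. code-block:: none

   # Hypothetical haproxy.cfg fragment balancing swift-proxy nodes.
   frontend swift_api
       bind *:8080
       default_backend swift_proxies

   backend swift_proxies
       balance roundrobin
       # Uses swift's healthcheck middleware, assumed to be enabled.
       option httpchk GET /healthcheck
       server proxy01 192.0.2.21:8080 check
       server proxy02 192.0.2.22:8080 check
       server proxy03 192.0.2.23:8080 check
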
@ -92,6 +92,85 @@ weighed against current stability.
Requirements
~~~~~~~~~~~~

.. temporary location of storage information until we establish a template

Storage requirements
--------------------
Using a scale-out storage solution with direct-attached storage (DAS) in
the servers is well suited for a general purpose OpenStack cloud. Cloud
services requirements determine your choice of scale-out solution. You
need to determine if a single, highly expandable and highly vertically
scalable, centralized storage array is suitable for your design. After
determining an approach, select the storage hardware based on these
criteria.

This list expands upon the potential impacts for including a particular
storage architecture (and corresponding storage hardware) into the
design for a general purpose OpenStack cloud:

Connectivity
  If storage protocols other than Ethernet are part of the storage solution,
  ensure the appropriate hardware has been selected. If a centralized storage
  array is selected, ensure that the hypervisor will be able to connect to
  that storage array for image storage.

Usage
  How the particular storage architecture will be used is critical for
  determining the architecture. Some of the configurations that will
  influence the architecture include whether it will be used by the
  hypervisors for ephemeral instance storage, or if OpenStack Object
  Storage will use it for object storage.

Instance and image locations
  Where instances and images will be stored will influence the
  architecture.

Server hardware
  If the solution is a scale-out storage architecture that includes
  DAS, it will affect the server hardware selection. This could ripple
  into the decisions that affect host density, instance density, power
  density, OS-hypervisor, management tools and others.

A general purpose OpenStack cloud has multiple options. The key factors
that influence the selection of storage hardware for a general purpose
OpenStack cloud are:

Capacity
  Hardware resources selected for the resource nodes should be capable
  of supporting enough storage for the cloud services. Defining the
  initial requirements and ensuring the design can support adding
  capacity is important. Hardware nodes selected for object storage
  should be capable of supporting a large number of inexpensive disks
  with no reliance on RAID controller cards. Hardware nodes selected
  for block storage should be capable of supporting high speed storage
  solutions and RAID controller cards to provide performance and
  redundancy to storage at a hardware level. Selecting hardware RAID
  controllers that automatically repair damaged arrays will assist
  with the replacement and repair of degraded or deleted storage
  devices.

Performance
  Disks selected for object storage services do not need to be fast
  performing disks. We recommend that object storage nodes take
  advantage of the best cost per terabyte available for storage.
  Contrastingly, disks chosen for block storage services should take
  advantage of performance boosting features that may entail the use
  of SSDs or flash storage to provide high performance block storage
  pools. Storage performance of ephemeral disks used for instances
  should also be taken into consideration.

Fault tolerance
  Object storage resource nodes have no requirements for hardware
  fault tolerance or RAID controllers. It is not necessary to plan for
  fault tolerance within the object storage hardware because the
  object storage service provides replication between zones as a
  feature of the service. Block storage nodes, compute nodes, and
  cloud controllers should all have fault tolerance built in at the
  hardware level by making use of hardware RAID controllers and
  varying levels of RAID configuration. The level of RAID chosen
  should be consistent with the performance and availability
  requirements of the cloud.

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

@ -154,5 +154,57 @@ systems as an inline cache.
Requirements
~~~~~~~~~~~~

Storage requirements
--------------------

Storage-focused OpenStack clouds must address I/O intensive workloads.
These workloads are not CPU intensive, nor are they consistently network
intensive. The network may be heavily utilized to transfer storage data,
but they are not otherwise network intensive.

The selection of storage hardware determines the overall performance and
scalability of a storage-focused OpenStack design architecture. Several
factors impact the design process, including:

Latency
  A key consideration in a storage-focused OpenStack cloud is latency.
  Using solid-state disks (SSDs) to minimize latency and to reduce CPU
  delays caused by waiting for the storage increases performance. Use
  RAID controller cards in compute hosts to improve the performance of
  the underlying disk subsystem.

Scale-out solutions
  Depending on the storage architecture, you can adopt a scale-out
  solution, or use a highly expandable and scalable centralized storage
  array. If a centralized storage array meets your requirements, then the
  array vendor determines the hardware selection. It is possible to build
  a storage array using commodity hardware with Open Source software, but
  doing so requires people with expertise to build such a system.

  On the other hand, a scale-out storage solution that uses
  direct-attached storage (DAS) in the servers may be an appropriate
  choice. This requires configuration of the server hardware to support
  the storage solution.

Considerations affecting storage architecture (and corresponding storage
hardware) of a storage-focused OpenStack cloud include:

Connectivity
  Ensure the connectivity matches the storage solution requirements. We
  recommend confirming that the network characteristics minimize latency
  to boost the overall performance of the design.

Latency
  Determine if the use case has consistent or highly variable latency.

Throughput
  Ensure that the storage solution throughput is optimized for your
  application requirements.

Server hardware
  Use of DAS impacts the server hardware choice and affects host
  density, instance density, power density, OS-hypervisor, and
  management tools.

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~