Added Ceph as a storage type for live migration Change-Id: If246912246a65c4591c341c46fa16723ad0c3a44 Closes-Bug: #1724130
6.1 KiB
Instance storage solutions
As part of the architecture design for a compute cluster, you must specify storage for the disk on which the instantiated instance runs. There are three main approaches to providing temporary storage:
- Off compute node storage—shared file system
- On compute node storage—shared file system
- On compute node storage—nonshared file system
In general, the questions you should ask when selecting storage are as follows:
- What are my workloads?
- Do my workloads have IOPS requirements?
- Are there read, write, or random access performance requirements?
- What is my forecast for the scaling of storage for compute?
- What storage is my enterprise currently using? Can it be re-purposed?
- How do I manage the storage operationally?
Many operators use separate compute and storage hosts instead of a hyperconverged solution. Compute services and storage services have different requirements, and compute hosts typically require more CPU and RAM than storage hosts. Therefore, for a fixed budget, it makes sense to have different configurations for your compute nodes and your storage nodes. Compute nodes will be invested in CPU and RAM, and storage nodes will be invested in block storage.
However, if you are more restricted in the number of physical hosts you have available for creating your cloud and you want to be able to dedicate as many of your hosts as possible to running instances, it makes sense to run compute and storage on the same machines or use an existing storage array that is available.
The three main approaches to instance storage are provided in the next few sections.
Non-compute node based shared file system
In this option, the disks storing the running instances are hosted in servers outside of the compute nodes.
If you use separate compute and storage hosts, you can treat your compute hosts as "stateless". As long as you do not have any instances currently running on a compute host, you can take it offline or wipe it completely without having any effect on the rest of your cloud. This simplifies maintenance for the compute hosts.
There are several advantages to this approach:
- If a compute node fails, instances are usually easily recoverable.
- Running a dedicated storage system can be operationally simpler.
- You can scale to any number of spindles.
- It may be possible to share the external storage for other purposes.
The main disadvantages to this approach are:
- Depending on design, heavy I/O usage from some instances can affect unrelated instances.
- Use of the network can decrease performance.
- Scalability can be affected by network architecture.
On compute node storage—shared file system
In this option, each compute node is specified with a significant amount of disk space, but a distributed file system ties the disks from each compute node into a single mount.
The main advantage of this option is that it scales to external storage when you require additional storage.
However, this option has several disadvantages:
- Running a distributed file system can make you lose your data locality compared with nonshared storage.
- Recovery of instances is complicated by depending on multiple hosts.
- The chassis size of the compute node can limit the number of spindles able to be used in a compute node.
- Use of the network can decrease performance.
- Loss of compute nodes decreases storage availability for all hosts.
On compute node storage—nonshared file system
In this option, each compute node is specified with enough disks to store the instances it hosts.
There are two main advantages:
- Heavy I/O usage on one compute node does not affect instances on other compute nodes. Direct I/O access can increase performance.
- Each host can have different storage profiles for hosts aggregation and availability zones.
There are several disadvantages:
- If a compute node fails, the data associated with the instances running on that node is lost.
- The chassis size of the compute node can limit the number of spindles able to be used in a compute node.
- Migrations of instances from one node to another are more complicated and rely on features that may not continue to be developed.
- If additional storage is required, this option does not scale.
Running a shared file system on a storage system apart from the compute nodes is ideal for clouds where reliability and scalability are the most important factors. Running a shared file system on the compute nodes themselves may be best in a scenario where you have to deploy to pre-existing servers for which you have little to no control over their specifications or have specific storage performance needs but do not have a need for persistent storage.
Issues with live migration
Live migration is an integral part of the operations of the cloud. This feature provides the ability to seamlessly move instances from one physical host to another, a necessity for performing upgrades that require reboots of the compute hosts, but only works well with shared storage.
Live migration can also be done with non-shared storage, using a feature known as KVM live block migration. While an earlier implementation of block-based migration in KVM and QEMU was considered unreliable, there is a newer, more reliable implementation of block-based live migration as of the Mitaka release.
Live migration and block migration still have some issues:
- Error reporting has received some attention in Mitaka and Newton but there are improvements needed.
- Live migration resource tracking issues.
- Live migration of rescued images.
Choice of file system
If you want to support shared-storage live migration, you need to configure a distributed file system.
Possible options include:
- NFS (default for Linux)
- Ceph
- GlusterFS
- MooseFS
- Lustre
We recommend that you choose the option operators are most familiar with. NFS is the easiest to set up and there is extensive community knowledge about it.