Remove arch-design docs
The arch-design docs have not been maintained, and the Ops Docs SIG plans to
take ownership and maintain them out of its own repo. To avoid jobs
overwriting the published content, this removes the docs from
openstack-manuals.

Depends-on: https://review.openstack.org/621012
Change-Id: I58acb6a5d25d8e0b02e5f3b068aebb4ec144bf1a
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
@@ -1,27 +0,0 @@
[metadata]
name = architecturedesignguide
summary = OpenStack Architecture Design Guide
author = OpenStack
author-email = openstack-dev@lists.openstack.org
home-page = https://docs.openstack.org/
classifier =
    Environment :: OpenStack
    Intended Audience :: Information Technology
    Intended Audience :: Cloud Architects
    License :: OSI Approved :: Apache Software License
    Operating System :: POSIX :: Linux
    Topic :: Documentation

[global]
setup-hooks =
    pbr.hooks.setup_hook

[files]

[build_sphinx]
warning-is-error = 1
build-dir = build
source-dir = source

[wheel]
universal = 1
@@ -1,30 +0,0 @@
#!/usr/bin/env python
# Copyright (c) 2013 Hewlett-Packard Development Company, L.P.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# THIS FILE IS MANAGED BY THE GLOBAL REQUIREMENTS REPO - DO NOT EDIT
import setuptools

# In python < 2.7.4, a lazy loading of package `pbr` will break
# setuptools if some other modules registered functions in `atexit`.
# solution from: http://bugs.python.org/issue15881#msg170215
try:
    import multiprocessing  # noqa
except ImportError:
    pass

setuptools.setup(
    setup_requires=['pbr'],
    pbr=True)
@@ -1,13 +0,0 @@
=========================
Architecture requirements
=========================

This chapter describes the enterprise and operational factors that impact the
design of an OpenStack cloud.

.. toctree::
   :maxdepth: 2

   arch-requirements/arch-requirements-enterprise
   arch-requirements/arch-requirements-operations
   arch-requirements/arch-requirements-ha
@@ -1,433 +0,0 @@
=======================
Enterprise requirements
=======================

The following sections describe business, usage, and performance
considerations for customers which will impact cloud architecture design.

Cost
~~~~

Financial factors are a primary concern for any organization. Cost
considerations may influence the type of cloud that you build.
For example, a general purpose cloud is unlikely to be the most
cost-effective environment for specialized applications.
Unless business needs dictate that cost is a critical factor,
cost should not be the sole consideration when choosing or designing a cloud.

As a general guideline, increasing the complexity of a cloud architecture
increases the cost of building and maintaining it. For example, a hybrid or
multi-site cloud architecture involving multiple vendors and technical
architectures may require higher setup and operational costs because of the
need for more sophisticated orchestration and brokerage tools than in other
architectures. However, overall operational costs might be lower by virtue of
using a cloud brokerage tool to deploy the workloads to the most cost effective
platform.

.. TODO Replace examples with the proposed example use cases in this guide.

Consider the following cost categories when designing a cloud:

* Compute resources

* Networking resources

* Replication

* Storage

* Management

* Operational costs

It is also important to consider how costs will increase as your cloud scales.
Choices that have a negligible impact in small systems may considerably
increase costs in large systems. In these cases, it is important to minimize
capital expenditure (CapEx) at all layers of the stack. Operators of massively
scalable OpenStack clouds require the use of dependable commodity hardware and
freely available open source software components to reduce deployment costs and
operational expenses. Initiatives like Open Compute (more information available
in the `Open Compute Project <http://www.opencompute.org>`_) provide additional
information.

Time-to-market
~~~~~~~~~~~~~~

The ability to deliver services or products within a flexible time
frame is a common business factor when building a cloud. Allowing users to
self-provision and gain access to compute, network, and
storage resources on-demand may decrease time-to-market for new products
and applications.

You must balance the time required to build a new cloud platform against the
time saved by migrating users away from legacy platforms. In some cases,
existing infrastructure may influence your architecture choices. For example,
using multiple cloud platforms may be a good option when there is an existing
investment in several applications, as it could be faster to tie the
investments together rather than migrating the components and refactoring them
to a single platform.

Revenue opportunity
~~~~~~~~~~~~~~~~~~~

Revenue opportunities vary based on the intent and use case of the cloud.
The requirements of a commercial, customer-facing product are often very
different from an internal, private cloud. You must consider what features
make your design most attractive to your users.

Capacity planning and scalability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Capacity and the placement of workloads are key design considerations
for clouds. A long-term capacity plan for these designs must
incorporate growth over time to prevent permanent consumption of more
expensive external clouds. To avoid this scenario, account for future
applications' capacity requirements and plan growth appropriately.

It is difficult to predict the amount of load a particular
application might incur if the number of users fluctuates, or the
application experiences an unexpected increase in use.
It is possible to define application requirements in terms of
vCPU, RAM, bandwidth, or other resources and plan appropriately.
However, other clouds might not use the same meter or even the same
oversubscription rates.

Oversubscription is a method to emulate more capacity than
may physically be present. For example, a physical hypervisor node with 32 GB
RAM may host 24 instances, each provisioned with 2 GB RAM.
As long as all 24 instances do not concurrently use 2 full
gigabytes, this arrangement works well.
However, some hosts take oversubscription to extremes and,
as a result, performance can be inconsistent.
If at all possible, determine what the oversubscription rates
of each host are and plan capacity accordingly.
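
The arithmetic in the example above can be checked directly. The following is
a minimal sketch (not part of the guide's tooling) that computes the
effective RAM oversubscription ratio of a hypervisor; the numbers are the
illustrative values from the paragraph above:

.. code-block:: python

   def ram_oversubscription_ratio(physical_ram_gb, instances, ram_per_instance_gb):
       """Return the provisioned-to-physical RAM ratio for one hypervisor."""
       provisioned_gb = instances * ram_per_instance_gb
       return provisioned_gb / float(physical_ram_gb)

   # 24 instances x 2 GB on a 32 GB node -> a 1.5x oversubscription ratio
   print(ram_oversubscription_ratio(32, 24, 2))

A ratio of 1.5 corresponds to the kind of RAM allocation ratio the Compute
service can be configured with; workloads that cannot tolerate memory
contention should be planned at or below a ratio of 1.0.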

.. TODO Considerations when building your cloud, racks, CPUs, compute node
   density. For ongoing capacity planning refer to the Ops Guide.


Performance
~~~~~~~~~~~

Performance is a critical consideration when designing any cloud, and becomes
increasingly important as size and complexity grow. While single-site, private
clouds can be closely controlled, multi-site and hybrid deployments require
more careful planning to reduce problems such as network latency between sites.

For example, you should consider the time required to
run a workload in different clouds and methods for reducing this time.
This may require moving data closer to applications or applications
closer to the data they process, and grouping functionality so that
connections that require low latency take place over a single cloud
rather than spanning clouds.

This may also require a CMP that can determine which cloud can most
efficiently run which types of workloads.

Using native OpenStack tools can help improve performance.
For example, you can use Telemetry to measure performance and the
Orchestration service (heat) to react to changes in demand.

.. note::

   Orchestration requires special client configurations to integrate
   with Amazon Web Services. For other types of clouds, use CMP features.

Cloud resource deployment
   The cloud user expects repeatable, dependable, and deterministic processes
   for launching and deploying cloud resources. You could deliver this through
   a web-based interface or publicly available API endpoints. All appropriate
   options for requesting cloud resources must be available through some type
   of user interface, a command-line interface (CLI), or API endpoints.

Consumption model
   Cloud users expect a fully self-service and on-demand consumption model.
   When an OpenStack cloud reaches the massively scalable size, expect
   consumption as a service in each and every way.

   * Everything must be capable of automation. For example, everything from
     compute hardware, storage hardware, networking hardware, to the
     installation and configuration of the supporting software. Manual
     processes are impractical in a massively scalable OpenStack design
     architecture.

   * Massively scalable OpenStack clouds require extensive metering and
     monitoring functionality to maximize the operational efficiency by keeping
     the operator informed about the status and state of the infrastructure.
     This includes full scale metering of the hardware and software status. A
     corresponding framework of logging and alerting is also required to store
     and enable operations to act on the meters provided by the metering and
     monitoring solutions. The cloud operator also needs a solution that uses
     the data provided by the metering and monitoring solution to provide
     capacity planning and capacity trending analysis.

Location
   For many use cases the proximity of the user to their workloads has a
   direct influence on the performance of the application and therefore
   should be taken into consideration in the design. Certain applications
   require zero to minimal latency that can only be achieved by deploying
   the cloud in multiple locations. These locations could be in different
   data centers, cities, countries or geographical regions, depending on
   the user requirement and location of the users.

Input-Output requirements
   Input-Output performance requirements require researching and
   modeling before deciding on a final storage framework. Running
   benchmarks for Input-Output performance provides a baseline for
   expected performance levels. If these tests include details, then
   the resulting data can help model behavior and results during
   different workloads. Running scripted smaller benchmarks during the
   lifecycle of the architecture helps record the system health at
   different points in time. The data from these scripted benchmarks
   assist in future scoping and gaining a deeper understanding of an
   organization's needs.

Scale
   Scaling storage solutions in a storage-focused OpenStack
   architecture design is driven by initial requirements, including
   :term:`IOPS <Input/output Operations Per Second (IOPS)>`, capacity,
   bandwidth, and future needs. Planning capacity based on projected needs
   over the course of a budget cycle is important for a design. The
   architecture should balance cost and capacity, while also allowing
   flexibility to implement new technologies and methods as they become
   available.

Network
~~~~~~~

It is important to consider the functionality, security, scalability,
availability, and testability of the network when choosing a CMP and cloud
provider.

* Decide on a network framework and design minimum functionality tests.
  This ensures testing and functionality persists during and after
  upgrades.
* Scalability across multiple cloud providers may dictate which underlying
  network framework you choose in different cloud providers.
  It is important to present the network API functions and to verify
  that functionality persists across all cloud endpoints chosen.
* High availability implementations vary in functionality and design.
  Examples of some common methods are active-hot-standby, active-passive,
  and active-active.
  Development of high availability and test frameworks is necessary to
  ensure understanding of functionality and limitations.
* Consider the security of data between the client and the endpoint,
  and of traffic that traverses the multiple clouds.

For example, degraded video streams and low quality VoIP sessions negatively
impact user experience and may lead to productivity and economic loss.

Network misconfigurations
   Configuring incorrect IP addresses, VLANs, and routers can cause
   outages to areas of the network or, in the worst-case scenario, the
   entire cloud infrastructure. Automate network configurations to
   minimize the opportunity for operator error as it can cause
   disruptive problems.

Capacity planning
   Cloud networks require management for capacity and growth over time.
   Capacity planning includes the purchase of network circuits and
   hardware that can potentially have lead times measured in months or
   years.

Network tuning
   Configure cloud networks to minimize link loss, packet loss, packet
   storms, broadcast storms, and loops.

Single Point Of Failure (SPOF)
   Consider high availability at the physical and environmental layers.
   If there is a single point of failure due to only one upstream link,
   or only one power supply, an outage can become unavoidable.

Complexity
   An overly complex network design can be difficult to maintain and
   troubleshoot. While device-level configuration can ease maintenance
   concerns and automated tools can handle overlay networks, avoid or
   document non-traditional interconnects between functions and
   specialized hardware to prevent outages.

Non-standard features
   There are additional risks that arise from configuring the cloud
   network to take advantage of vendor specific features. One example
   is multi-link aggregation (MLAG) used to provide redundancy at the
   aggregator switch level of the network. MLAG is not a standard and,
   as a result, each vendor has their own proprietary implementation of
   the feature. MLAG architectures are not interoperable across switch
   vendors, which leads to vendor lock-in, and can cause delays or an
   inability to upgrade components.

Dynamic resource expansion or bursting
   An application that requires additional resources may suit a multiple
   cloud architecture. For example, a retailer needs additional resources
   during the holiday season, but does not want to add private cloud
   resources to meet the peak demand.
   The user can accommodate the increased load by bursting to
   a public cloud for these peak load periods. These bursts could be
   for long or short cycles ranging from hourly to yearly.

Compliance and geo-location
~~~~~~~~~~~~~~~~~~~~~~~~~~~

An organization may have certain legal obligations and regulatory
compliance measures which could require certain workloads or data to not
be located in certain regions.

Compliance considerations are particularly important for multi-site clouds.
Considerations include:

- federal legal requirements
- local jurisdictional legal and compliance requirements
- image consistency and availability
- storage replication and availability (both block and file/object storage)
- authentication, authorization, and auditing (AAA)

Geographical considerations may also impact the cost of building or leasing
data centers. Considerations include:

- floor space
- floor weight
- rack height and type
- environmental considerations
- power usage and power usage efficiency (PUE)
- physical security

Auditing
~~~~~~~~

A well-considered auditing plan is essential for quickly finding issues.
Keeping track of changes made to security groups and tenant changes can be
useful in rolling back the changes if they affect production. For example,
if all security group rules for a tenant disappeared, the ability to quickly
track down the issue would be important for operational and legal reasons.
For more details on auditing, see the `Compliance chapter
<https://docs.openstack.org/security-guide/compliance.html>`_ in the OpenStack
Security Guide.

Security
~~~~~~~~

The importance of security varies based on the type of organization using
a cloud. For example, government and financial institutions often have
very high security requirements. Security should be implemented according to
asset, threat, and vulnerability risk assessment matrices.
See `security-requirements`.

Service level agreements
~~~~~~~~~~~~~~~~~~~~~~~~

Service level agreements (SLA) must be developed in conjunction with business,
technical, and legal input. Small, private clouds may operate under an informal
SLA, but hybrid or public clouds generally require more formal agreements with
their users.

For a user of a massively scalable OpenStack public cloud, there are no
expectations for control over security, performance, or availability. Users
expect only SLAs related to uptime of API services, and very basic SLAs for
services offered. It is the user's responsibility to address these issues on
their own. The exception to this expectation is the rare case of a massively
scalable cloud infrastructure built for a private or government organization
that has specific requirements.

High performance systems have SLA requirements for a minimum quality of service
with regard to guaranteed uptime, latency, and bandwidth. The level of the
SLA can have a significant impact on the network architecture and
requirements for redundancy in the systems.

Hybrid cloud designs must accommodate differences in SLAs between providers,
and consider their enforceability.

Application readiness
~~~~~~~~~~~~~~~~~~~~~

Some applications are tolerant of a lack of synchronized object
storage, while others may need those objects to be replicated and
available across regions. Understanding how the cloud implementation
impacts new and existing applications is important for risk mitigation,
and the overall success of a cloud project. Applications may have to be
written or rewritten for an infrastructure with little to no redundancy,
or with the cloud in mind.

Application momentum
   Businesses with existing applications may find that it is
   more cost effective to integrate applications on multiple
   cloud platforms than migrating them to a single platform.

No predefined usage model
   The lack of a pre-defined usage model enables the user to run a wide
   variety of applications without having to know the application
   requirements in advance. This provides a degree of independence and
   flexibility that no other cloud scenarios are able to provide.

On-demand and self-service application
   By definition, a cloud provides end users with the ability to
   self-provision computing power, storage, networks, and software in a
   simple and flexible way. The user must be able to scale their
   resources up to a substantial level without disrupting the
   underlying host operations. One of the benefits of using a general
   purpose cloud architecture is the ability to start with limited
   resources and increase them over time as the user demand grows.

Authentication
~~~~~~~~~~~~~~

It is recommended to have a single authentication domain rather than a
separate implementation for each and every site. This requires an
authentication mechanism that is highly available and distributed to
ensure continuous operation. Authentication server locality might be
required and should be planned for.

Migration, availability, site loss and recovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Outages can cause partial or full loss of site functionality. Strategies
should be implemented to understand and plan for recovery scenarios.

* The deployed applications need to continue to function and, more
  importantly, you must consider the impact on the performance and
  reliability of the application when a site is unavailable.

* It is important to understand what happens to the replication of
  objects and data between the sites when a site goes down. If this
  causes queues to start building up, consider how long these queues
  can safely exist until an error occurs.

* After an outage, ensure the method for resuming proper operations of
  a site is implemented when it comes back online. We recommend you
  architect the recovery to avoid race conditions.

Disaster recovery and business continuity
   Cheaper storage makes the public cloud suitable for maintaining
   backup applications.

Migration scenarios
   Hybrid cloud architecture enables the migration of
   applications between different clouds.

Provider availability or implementation details
   Business changes can affect provider availability.
   Likewise, changes in a provider's service can disrupt
   a hybrid cloud environment or increase costs.

Provider API changes
   Consumers of external clouds rarely have control over provider
   changes to APIs, and changes can break compatibility.
   Using only the most common and basic APIs can minimize potential conflicts.

Image portability
   As of the Kilo release, there is no common image format that is
   usable by all clouds. Conversion or recreation of images is necessary
   if migrating between clouds. To simplify deployment, use the smallest
   and simplest images feasible, install only what is necessary, and
   use a deployment manager such as Chef or Puppet. Do not use golden
   images to speed up the process unless you repeatedly deploy the same
   images on the same cloud.

API differences
   Avoid using a hybrid cloud deployment with more than just
   OpenStack (or with different versions of OpenStack) as API changes
   can cause compatibility issues.

Business or technical diversity
   Organizations leveraging cloud-based services can embrace business
   diversity and utilize a hybrid cloud design to spread their
   workloads across multiple cloud providers. This ensures that
   no single cloud provider is the sole host for an application.
@@ -1,182 +0,0 @@
.. _high-availability:

=================
High availability
=================

Data plane and control plane
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When designing an OpenStack cloud, it is important to consider the needs
dictated by the :term:`Service Level Agreement (SLA)`. This includes the core
services required to maintain availability of running Compute service
instances, networks, storage, and additional services running on top of those
resources. These services are often referred to as the Data Plane services,
and are generally expected to be available all the time.

The remaining services, responsible for create, read, update and delete (CRUD)
operations, metering, monitoring, and so on, are often referred to as the
Control Plane. The SLA is likely to dictate a lower uptime requirement for
these services.

The services comprising an OpenStack cloud have a number of requirements that
you need to understand in order to be able to meet SLA terms. For example, in
order to provide the Compute service a minimum of storage, message queueing and
database services are necessary as well as the networking between
them.

Ongoing maintenance operations are made much simpler if there is logical and
physical separation of Data Plane and Control Plane systems. It then becomes
possible to, for example, reboot a controller without affecting customers.
If one service failure affects the operation of an entire server (``noisy
neighbor``), the separation between Control and Data Planes enables rapid
maintenance with a limited effect on customer operations.

Eliminating single points of failure within each site
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OpenStack lends itself to deployment in a highly available manner where it is
expected that at least 2 servers be utilized. These can run all the services
involved from the message queuing service, for example ``RabbitMQ`` or
``QPID``, and an appropriately deployed database service such as ``MySQL`` or
``MariaDB``. As services in the cloud are scaled out, back-end services will
need to scale too. Monitoring and reporting on server utilization and response
times, as well as load testing your systems, will help determine scale out
decisions.

The OpenStack services themselves should be deployed across multiple servers
that do not represent a single point of failure. Ensuring availability can
be achieved by placing these services behind highly available load balancers
that have multiple OpenStack servers as members.
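
As a rough illustration of why this matters, the following back-of-the-envelope
sketch (an assumption-laden model, not an OpenStack tool) estimates the
availability of a service when several identical, independently failing
servers sit behind a load balancer:

.. code-block:: python

   def redundant_availability(single_server_availability, servers):
       """Probability that at least one of the servers is up,
       assuming server failures are independent."""
       return 1 - (1 - single_server_availability) ** servers

   # A 99% available server reaches ~99.99% with two members
   # and ~99.9999% with three.
   for count in (1, 2, 3):
       print(count, redundant_availability(0.99, count))

Real failures are rarely independent (shared switches, power, or storage), so
treat such numbers as an upper bound and keep eliminating the shared single
points of failure discussed below.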

There are a small number of OpenStack services which are intended to only run
in one place at a time (for example, the ``ceilometer-agent-central``
service). In order to prevent these services from becoming a single point of
failure, they can be controlled by clustering software such as ``Pacemaker``.

In OpenStack, the infrastructure is integral to providing services and should
always be available, especially when operating with SLAs. Ensuring network
availability is accomplished by designing the network architecture so that no
single point of failure exists. A consideration of the number of switches,
routes and redundancies of power should be factored into core infrastructure,
as well as the associated bonding of networks to provide diverse routes to your
highly available switch infrastructure.

Care must be taken when deciding network functionality. Currently, OpenStack
supports both the legacy networking (nova-network) system and the newer,
extensible OpenStack Networking (neutron). OpenStack Networking and legacy
networking both have their advantages and disadvantages. They are both valid
and supported options that fit different network deployment models described in
the `OpenStack Operations Guide
<https://docs.openstack.org/ops-guide/arch_network_design.html#network-topology>`_.

When using the Networking service, the OpenStack controller servers or separate
Networking hosts handle routing unless the dynamic virtual routers pattern for
routing is selected. Running routing directly on the controller servers mixes
the Data and Control Planes and can cause complex issues with performance and
troubleshooting. It is possible to use third party software and external
appliances that help maintain highly available layer three routes. Doing so
allows for common application endpoints to control network hardware, or to
provide complex multi-tier web applications in a secure manner. It is also
possible to completely remove routing from Networking, and instead rely on
hardware routing capabilities. In this case, the switching infrastructure must
support layer three routing.

Application design must also be factored into the capabilities of the
underlying cloud infrastructure. If the compute hosts do not provide a seamless
live migration capability, then it must be expected that if a compute host
fails, that instance and any data local to that instance will be deleted.
However, when providing an expectation to users that instances have a
high-level of uptime guaranteed, the infrastructure must be deployed in a way
that eliminates any single point of failure if a compute host disappears.
This may include utilizing shared file systems on enterprise storage or
OpenStack Block storage to provide a level of guarantee to match service
features.

If using a storage design that includes shared access to centralized storage,
ensure that this is also designed without single points of failure and the SLA
for the solution matches or exceeds the expected SLA for the Data Plane.

Eliminating single points of failure in a multi-region design
-------------------------------------------------------------

Some services are commonly shared between multiple regions, including the
Identity service and the Dashboard. In this case, it is necessary to ensure
that the databases backing the services are replicated, and that access to
multiple workers across each site can be maintained in the event of losing a
single region.

Multiple network links should be deployed between sites to provide redundancy
for all components. This includes storage replication, which should be isolated
to a dedicated network or VLAN with the ability to assign QoS to control the
replication traffic or provide priority for this traffic.

.. note::

   If the data store is highly changeable, the network requirements could have
   a significant effect on the operational cost of maintaining the sites.

If the design incorporates more than one site, the ability to maintain object
availability in both sites has significant implications on the Object Storage
design and implementation. It also has a significant impact on the WAN network
design between the sites.

If applications running in a cloud are not cloud-aware, there should be clear
measures and expectations to define what the infrastructure can and cannot
support. An example would be shared storage between sites. It is possible,
however such a solution is not native to OpenStack and requires a third-party
hardware vendor to fulfill such a requirement. Another example can be seen in
applications that are able to consume resources in object storage directly.

Connecting more than two sites increases the challenges and adds more
complexity to the design considerations. Multi-site implementations require
planning to address the additional topology used for internal and external
connectivity. Some options include full mesh topology, hub spoke, spine leaf,
and 3D Torus.
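
To make the scaling difference between two of these options concrete, a small
illustrative calculation (not from the original guide) of the number of
inter-site links each topology requires:

.. code-block:: python

   def full_mesh_links(sites):
       """Every site connects directly to every other site."""
       return sites * (sites - 1) // 2

   def hub_spoke_links(sites):
       """One hub site; every other site connects only to the hub."""
       return sites - 1

   for sites in (3, 5, 10):
       print(sites, full_mesh_links(sites), hub_spoke_links(sites))

A full mesh of ten sites already needs 45 links where hub and spoke needs
nine; the trade-off is that hub and spoke concentrates traffic and failure
impact on the hub site.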

For more information on high availability in OpenStack, see the `OpenStack High
Availability Guide <https://docs.openstack.org/ha-guide/>`_.

Site loss and recovery
~~~~~~~~~~~~~~~~~~~~~~

Outages can cause partial or full loss of site functionality. Strategies
should be implemented to understand and plan for recovery scenarios.

* The deployed applications need to continue to function and, more
  importantly, you must consider the impact on the performance and
  reliability of the application if a site is unavailable.

* It is important to understand what happens to the replication of
  objects and data between the sites when a site goes down. If this
  causes queues to start building up, consider how long these queues
  can safely exist until an error occurs.

* After an outage, ensure that operations of a site are resumed when it
  comes back online. We recommend that you architect the recovery to
  avoid race conditions.


Replicating inter-site data
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Traditionally, replication has been the best method of protecting object store
implementations. A variety of replication methods exist in storage
architectures, for example synchronous and asynchronous mirroring. Most object
stores and back-end storage systems implement methods for replication at the
storage subsystem layer. Object stores also tailor replication techniques to
fit a cloud's requirements.

Organizations must find the right balance between data integrity and data
availability. Replication strategy may also influence disaster recovery
methods.

Replication across different racks, data centers, and geographical regions
increases focus on determining and ensuring data locality. The ability to
guarantee data is accessed from the nearest or fastest storage can be necessary
for applications to perform well.

.. note::

   When running embedded object store methods, ensure that you do not
   instigate extra data replication as this may cause performance issues.
@@ -1,259 +0,0 @@
========================
Operational requirements
========================

This section describes operational factors affecting the design of an
OpenStack cloud.

Network design
~~~~~~~~~~~~~~

The network design for an OpenStack cluster includes decisions regarding
the interconnect needs within the cluster, the need to allow clients to
access their resources, and the access requirements for operators to
administrate the cluster. You should consider the bandwidth, latency,
and reliability of these networks.

Consider additional design decisions about monitoring and alarming.
If you are using an external provider, service level agreements (SLAs)
are typically defined in your contract. Operational considerations such
as bandwidth, latency, and jitter can be part of the SLA.

As demand for network resources increases, make sure your network design
accommodates expansion and upgrades. Operators add additional IP address
blocks and add additional bandwidth capacity. In addition, consider
managing hardware and software lifecycle events, for example upgrades,
decommissioning, and outages, while avoiding service interruptions for
tenants.

Factor maintainability into the overall network design. This includes
the ability to manage and maintain IP addresses as well as the use of
overlay identifiers including VLAN tag IDs, GRE tunnel IDs, and MPLS
tags. As an example, if you need to change all of the IP addresses
on a network, a process known as renumbering, then the design must
support this function.

Address network-focused applications when considering certain
operational realities. For example, consider the impending exhaustion of
IPv4 addresses, the migration to IPv6, and the use of private networks
to segregate different types of traffic that an application receives or
generates. In the case of IPv4 to IPv6 migrations, applications should
follow best practices for storing IP addresses. We recommend you avoid
relying on IPv4 features that did not carry over to the IPv6 protocol or
have differences in implementation.
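
One concrete best practice is to store and compare addresses using a
version-agnostic type rather than raw strings. A minimal Python sketch using
the standard library ``ipaddress`` module (an illustration, not a guide
requirement):

.. code-block:: python

   import ipaddress

   def normalize(address):
       """Parse either an IPv4 or an IPv6 literal into a comparable object."""
       return ipaddress.ip_address(address)

   # Documentation example addresses; substitute real tenant addresses.
   stored = [normalize(a) for a in ("203.0.113.10", "2001:db8::10")]
   for addr in stored:
       print(addr.version, addr.compressed)

Code written this way does not need to special-case the address family when
the cloud later adds IPv6 networks.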

To segregate traffic, allow applications to create a private tenant
network for database and storage network traffic. Use a public network
for services that require direct client access from the Internet. Upon
segregating the traffic, consider :term:`quality of service (QoS)` and
security to ensure each network has the required level of service.

Also consider the routing of network traffic. For some applications,
develop a complex policy framework for routing. To create a routing
policy that satisfies business requirements, consider the economic cost
of transmitting traffic over expensive links versus cheaper links, in
addition to bandwidth, latency, and jitter requirements.

Finally, consider how to respond to network events. How load
transfers from one link to another during a failure scenario could be
a factor in the design. If you do not plan network capacity
correctly, failover traffic could overwhelm other ports or network
links and create a cascading failure scenario. In this case,
traffic that fails over to one link overwhelms that link and then
moves to the subsequent links until all network traffic stops.
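
A simple headroom check along these lines can be scripted. The sketch below is
hypothetical, assumes the steady-state load figures are known, and simply
flags links that would exceed a utilization threshold if a peer link's load
failed over onto them:

.. code-block:: python

   def failover_overloads(capacity_gbps, load_gbps, failed_link, limit=0.8):
       """Return links that exceed `limit` utilization when the load of
       `failed_link` is spread evenly across the remaining links."""
       remaining = [link for link in load_gbps if link != failed_link]
       extra = load_gbps[failed_link] / len(remaining)
       return [link for link in remaining
               if (load_gbps[link] + extra) / capacity_gbps[link] > limit]

   capacity = {"link-a": 10, "link-b": 10, "link-c": 10}
   load = {"link-a": 6, "link-b": 5, "link-c": 7}
   print(failover_overloads(capacity, load, "link-a"))

If any link shows up in the result, the design lacks failover headroom and a
single failure could start the cascading scenario described above.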

SLA considerations
~~~~~~~~~~~~~~~~~~

Service-level agreements (SLAs) define the levels of availability that will
impact the design of an OpenStack cloud to provide redundancy and high
availability.

SLA terms that affect the design include:

* API availability guarantees implying multiple infrastructure services
  and highly available load balancers.

* Network uptime guarantees affecting switch design, which might
  require redundant switching and power.

* Networking security policy requirements.

In any environment larger than just a few hosts, there are two areas
that might be subject to a SLA:

* Data Plane - services that provide virtualization, networking, and
  storage. Customers usually require these services to be continuously
  available.

* Control Plane - ancillary services such as API endpoints, and services that
  control CRUD operations. The services in this category are usually subject to
  a different SLA expectation and may be better suited on separate
  hardware or containers from the Data Plane services.

To effectively run cloud installations, initial downtime planning includes
creating processes and architectures that support planned maintenance
and unplanned system faults.
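
When negotiating those terms, it helps to translate an availability percentage
into the planned and unplanned downtime it actually allows. A quick,
illustrative calculation (not part of the original guide):

.. code-block:: python

   def allowed_downtime_minutes_per_month(availability_percent,
                                          minutes_per_month=30 * 24 * 60):
       """Minutes of downtime per month permitted by an availability target."""
       return minutes_per_month * (1 - availability_percent / 100.0)

   for sla in (99.0, 99.9, 99.99):
       print(sla, round(allowed_downtime_minutes_per_month(sla), 1))

A 99.9% target leaves roughly 43 minutes per month, which may not cover a
single controller maintenance window unless planned maintenance is excluded
from the SLA.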

It is important to determine as part of the SLA negotiation which party is
responsible for monitoring and starting up the Compute service instances if an
outage occurs.

Upgrading, patching, and changing configuration items may require
downtime for some services. Stopping services that form the Control Plane may
not impact the Data Plane. Live-migration of Compute instances may be required
to perform any actions that require downtime to Data Plane components.

There are many services outside the realms of pure OpenStack
code which affect the ability of a cloud design to meet SLAs, including:

* Database services, such as ``MySQL`` or ``PostgreSQL``.
* Services providing RPC, such as ``RabbitMQ``.
* External network attachments.
* Physical constraints such as power, rack space, network cabling, etc.
* Shared storage including SAN based arrays, storage clusters such as ``Ceph``,
  and/or NFS services.

Depending on the design, some network service functions may fall into both the
Control and Data Plane categories. For example, the neutron L3 Agent service
may be considered a Control Plane component, but the routers themselves would
be a Data Plane component.

In a design with multiple regions, the SLA would also need to take into
consideration the use of shared services such as the Identity service
and Dashboard.

Any SLA negotiation must also take into account the reliance on third parties
for critical aspects of the design. For example, if there is an existing SLA
on a component such as a storage system, the SLA must take into account this
limitation. If the required SLA for the cloud exceeds the agreed uptime levels
of the cloud components, additional redundancy would be required. This
consideration is critical in a hybrid cloud design, where there are multiple
third parties involved.

Support and maintenance
~~~~~~~~~~~~~~~~~~~~~~~

An operations staff supports, manages, and maintains an OpenStack environment.
Their skills may be specialized or varied depending on the size and purpose of
the installation.

The maintenance function of an operator should be taken into consideration:

Maintenance tasks
   Operating system patching, hardware/firmware upgrades, and datacenter
   related changes, as well as minor and release upgrades to OpenStack
   components are all ongoing operational tasks. The six monthly release
   cycle of the OpenStack projects needs to be considered as part of the
   cost of ongoing maintenance. The solution should take into account
   storage and network maintenance and the impact on underlying
   workloads.

Reliability and availability
   Reliability and availability depend on the many supporting components'
   availability and on the level of precautions taken by the service provider.
   This includes network, storage systems, datacenter, and operating systems.

For more information on
managing and maintaining your OpenStack environment, see the
`OpenStack Operations Guide <https://docs.openstack.org/operations-guide/>`_.

Logging and monitoring
----------------------

OpenStack clouds require appropriate monitoring platforms to identify and
manage errors.

.. note::

   We recommend leveraging existing monitoring systems to see if they
   are able to effectively monitor an OpenStack environment.

Specific meters that are critically important to capture include:

* Image disk utilization

* Response time to the Compute API (see the sketch after this list)
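
The second meter can be captured with nothing more than a timed HTTP request.
The following sketch is illustrative only; the endpoint URL and token are
placeholders for your deployment's values, and a production cloud would
normally rely on Telemetry or an existing monitoring agent instead:

.. code-block:: python

   import time
   import urllib.request

   def compute_api_response_time(endpoint, token):
       """Time a simple authenticated GET against the Compute API."""
       request = urllib.request.Request(endpoint,
                                        headers={"X-Auth-Token": token})
       start = time.monotonic()
       urllib.request.urlopen(request).read()
       return time.monotonic() - start

   # Hypothetical values -- substitute your own endpoint and token.
   print(compute_api_response_time(
       "https://cloud.example.com:8774/v2.1/servers", "REPLACE_WITH_TOKEN"))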

Logging and monitoring does not significantly differ for a multi-site OpenStack
cloud. The tools described in the `Logging and monitoring
<https://docs.openstack.org/operations-guide/ops-logging-monitoring.html>`__ in
the Operations Guide remain applicable. Logging and monitoring can be provided
on a per-site basis, and in a common centralized location.

When attempting to deploy logging and monitoring facilities to a centralized
location, care must be taken with the load placed on the inter-site networking
links.

Management software
-------------------

Management software providing clustering, logging, monitoring, and alerting
details for a cloud environment is often used. This impacts and affects the
overall OpenStack cloud design, and must account for the additional resource
consumption such as CPU, RAM, storage, and network
bandwidth.

The inclusion of clustering software, such as Corosync or Pacemaker, is
primarily determined by the availability of the cloud infrastructure and
the complexity of supporting the configuration after it is deployed. The
`OpenStack High Availability Guide <https://docs.openstack.org/ha-guide/>`_
provides more details on the installation and configuration of Corosync
and Pacemaker, should these packages need to be included in the design.

Some other potential design impacts include:

* OS-hypervisor combination
  Ensure that the selected logging, monitoring, or alerting tools support
  the proposed OS-hypervisor combination.

* Network hardware
  The network hardware selection needs to be supported by the logging,
  monitoring, and alerting software.

Database software
-----------------

Most OpenStack components require access to back-end database services
to store state and configuration information. Choose an appropriate
back-end database which satisfies the availability and fault tolerance
requirements of the OpenStack services.

MySQL is the default database for OpenStack, but other compatible
databases are available.

.. note::

   Telemetry uses MongoDB.

The chosen high availability database solution changes according to the
selected database. MySQL, for example, provides several options. Use a
replication technology such as Galera for active-active clustering. For
active-passive use some form of shared storage. Each of these potential
solutions has an impact on the design:

* Solutions that employ Galera/MariaDB require at least three MySQL
  nodes (see the quorum sketch after this list).

* MongoDB has its own design considerations for high availability.

* OpenStack design, generally, does not include shared storage.
  However, for some high availability designs, certain components might
  require it depending on the specific implementation.
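
The three-node minimum for Galera follows from quorum arithmetic: the cluster
only keeps serving writes while more than half of its nodes can see each
other. A small illustrative sketch (not a deployment tool):

.. code-block:: python

   def tolerated_node_failures(cluster_size):
       """Nodes that can fail while the rest still hold quorum (> 50%)."""
       quorum = cluster_size // 2 + 1
       return cluster_size - quorum

   for nodes in (1, 2, 3, 5):
       print(nodes, tolerated_node_failures(nodes))

A two-node cluster tolerates no failures (and risks split-brain), which is why
three nodes is the practical minimum for an active-active MySQL/Galera design.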

Operator access to systems
~~~~~~~~~~~~~~~~~~~~~~~~~~

There is a trend for cloud operations systems being hosted within the cloud
environment. Operators require access to these systems to resolve a major
incident.

Ensure that the network structure connects all clouds to form an integrated
system. Also consider the state of handoffs which must be reliable and have
minimal latency for optimal performance of the system.

If a significant portion of the cloud is on externally managed systems,
prepare for situations where it may not be possible to make changes.
Additionally, cloud providers may differ on how infrastructure must be managed
and exposed. This can lead to delays in root cause analysis where a provider
insists the blame lies with the other provider.
@@ -1 +0,0 @@
../../common
@ -1,307 +0,0 @@
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
# This file is execfile()d with the current directory set to its
|
||||
# containing dir.
|
||||
#
|
||||
# Note that not all possible configuration values are present in this
|
||||
# autogenerated file.
|
||||
#
|
||||
# All configuration values have a default; values that are commented out
|
||||
# serve to show the default.
|
||||
|
||||
import os
|
||||
# import sys
|
||||
|
||||
import openstackdocstheme
|
||||
|
||||
# If extensions (or modules to document with autodoc) are in another directory,
|
||||
# add these directories to sys.path here. If the directory is relative to the
|
||||
# documentation root, use os.path.abspath to make it absolute, like shown here.
|
||||
# sys.path.insert(0, os.path.abspath('.'))
|
||||
|
||||
|
||||
# -- General configuration ------------------------------------------------
|
||||
|
||||
# If your documentation needs a minimal Sphinx version, state it here.
|
||||
# needs_sphinx = '1.0'
|
||||
|
||||
# Add any Sphinx extension module names here, as strings. They can be
|
||||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
|
||||
# ones.
|
||||
extensions = ['openstackdocstheme']
|
||||
|
||||
# Add any paths that contain templates here, relative to this directory.
|
||||
# templates_path = ['_templates']
|
||||
|
||||
# The suffix of source filenames.
|
||||
source_suffix = '.rst'
|
||||
|
||||
# The encoding of source files.
|
||||
# source_encoding = 'utf-8-sig'
|
||||
|
||||
# The master toctree document.
|
||||
master_doc = 'index'
|
||||
|
||||
# General information about the project.
|
||||
repository_name = "openstack/openstack-manuals"
|
||||
bug_project = 'openstack-manuals'
|
||||
project = u'Architecture Design Guide'
|
||||
bug_tag = u'arch-design'
|
||||
copyright = u'2015-2018, OpenStack contributors'
|
||||
|
||||
# The version info for the project you're documenting, acts as replacement for
|
||||
# |version| and |release|, also used in various other places throughout the
|
||||
# built documents.
|
||||
#
|
||||
# The short X.Y version.
|
||||
version = ''
|
||||
# The full version, including alpha/beta/rc tags.
|
||||
release = ''
|
||||
|
||||
# The language for content autogenerated by Sphinx. Refer to documentation
|
||||
# for a list of supported languages.
|
||||
# language = None
|
||||
|
||||
# There are two options for replacing |today|: either, you set today to some
|
||||
# non-false value, then it is used:
|
||||
# today = ''
|
||||
# Else, today_fmt is used as the format for a strftime call.
|
||||
# today_fmt = '%B %d, %Y'
|
||||
|
||||
# List of patterns, relative to source directory, that match files and
|
||||
# directories to ignore when looking for source files.
|
||||
exclude_patterns = ['common/cli*', 'common/nova*', 'common/get-started-*']
|
||||
|
||||
# The reST default role (used for this markup: `text`) to use for all
|
||||
# documents.
|
||||
# default_role = None
|
||||
|
||||
# If true, '()' will be appended to :func: etc. cross-reference text.
|
||||
# add_function_parentheses = True
|
||||
|
||||
# If true, the current module name will be prepended to all description
|
||||
# unit titles (such as .. function::).
|
||||
# add_module_names = True
|
||||
|
||||
# If true, sectionauthor and moduleauthor directives will be shown in the
|
||||
# output. They are ignored by default.
|
||||
# show_authors = False
|
||||
|
||||
# The name of the Pygments (syntax highlighting) style to use.
|
||||
pygments_style = 'sphinx'
|
||||
|
||||
# A list of ignored prefixes for module index sorting.
|
||||
# modindex_common_prefix = []
|
||||
|
||||
# If true, keep warnings as "system message" paragraphs in the built documents.
|
||||
# keep_warnings = False
|
||||
|
||||
|
||||
# -- Options for HTML output ----------------------------------------------
|
||||
|
||||
# The theme to use for HTML and HTML Help pages. See the documentation for
|
||||
# a list of builtin themes.
|
||||
html_theme = 'openstackdocs'
|
||||
|
||||
# Theme options are theme-specific and customize the look and feel of a theme
|
||||
# further. For a list of options available for each theme, see the
|
||||
# documentation.
|
||||
html_theme_options = {
|
||||
'display_badge': False
|
||||
}
|
||||
|
||||
# Add any paths that contain custom themes here, relative to this directory.
|
||||
# html_theme_path = [openstackdocstheme.get_html_theme_path()]
|
||||
|
||||
# The name for this set of Sphinx documents. If None, it defaults to
|
||||
# "<project> v<release> documentation".
|
||||
# html_title = None
|
||||
|
||||
# A shorter title for the navigation bar. Default is the same as html_title.
|
||||
# html_short_title = None
|
||||
|
||||
# The name of an image file (relative to this directory) to place at the top
|
||||
# of the sidebar.
|
||||
# html_logo = None
|
||||
|
||||
# The name of an image file (within the static path) to use as favicon of the
|
||||
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
|
||||
# pixels large.
|
||||
# html_favicon = None
|
||||
|
||||
# Add any paths that contain custom static files (such as style sheets) here,
|
||||
# relative to this directory. They are copied after the builtin static files,
|
||||
# so a file named "default.css" will overwrite the builtin "default.css".
|
||||
# html_static_path = []
|
||||
|
||||
# Add any extra paths that contain custom files (such as robots.txt or
|
||||
# .htaccess) here, relative to this directory. These files are copied
|
||||
# directly to the root of the documentation.
|
||||
# html_extra_path = []
|
||||
|
||||
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
|
||||
# using the given strftime format.
|
||||
# So that we can enable "log-a-bug" links from each output HTML page, this
|
||||
# variable must be set to a format that includes year, month, day, hours and
|
||||
# minutes.
|
||||
html_last_updated_fmt = '%Y-%m-%d %H:%M'
|
||||
|
||||
# If true, SmartyPants will be used to convert quotes and dashes to
|
||||
# typographically correct entities.
|
||||
# html_use_smartypants = True
|
||||
|
||||
# Custom sidebar templates, maps document names to template names.
|
||||
# html_sidebars = {}
|
||||
|
||||
# Additional templates that should be rendered to pages, maps page names to
|
||||
# template names.
|
||||
# html_additional_pages = {}
|
||||
|
||||
# If false, no module index is generated.
|
||||
# html_domain_indices = True
|
||||
|
||||
# If false, no index is generated.
|
||||
html_use_index = False
|
||||
|
||||
# If true, the index is split into individual pages for each letter.
|
||||
# html_split_index = False
|
||||
|
||||
# If true, links to the reST sources are added to the pages.
|
||||
html_show_sourcelink = False
|
||||
|
||||
# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
|
||||
# html_show_sphinx = True
|
||||
|
||||
# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
|
||||
# html_show_copyright = True
|
||||
|
||||
# If true, an OpenSearch description file will be output, and all pages will
|
||||
# contain a <link> tag referring to it. The value of this option must be the
|
||||
# base URL from which the finished HTML is served.
|
||||
# html_use_opensearch = ''
|
||||
|
||||
# This is the file name suffix for HTML files (e.g. ".xhtml").
|
||||
# html_file_suffix = None
|
||||
|
||||
# Output file base name for HTML help builder.
|
||||
htmlhelp_basename = 'arch-design'
|
||||
|
||||
# If true, publish source files
|
||||
html_copy_source = False
|
||||
|
||||
# -- Options for LaTeX output ---------------------------------------------
|
||||
pdf_theme_path = openstackdocstheme.get_pdf_theme_path()
|
||||
openstack_logo = openstackdocstheme.get_openstack_logo_path()
|
||||
|
||||
latex_custom_template = r"""
|
||||
\newcommand{\openstacklogo}{%s}
|
||||
\usepackage{%s}
|
||||
""" % (openstack_logo, pdf_theme_path)
|
||||
|
||||
latex_engine = 'xelatex'
|
||||
|
||||
latex_elements = {
|
||||
# The paper size ('letterpaper' or 'a4paper').
|
||||
'papersize': 'a4paper',
|
||||
|
||||
# The font size ('10pt', '11pt' or '12pt').
|
||||
'pointsize': '11pt',
|
||||
|
||||
# Default figure alignment
|
||||
'figure_align': 'H',
|
||||
|
||||
# Do not generate a blank page after each chapter
|
||||
'classoptions': ',openany',
|
||||
|
||||
# Additional stuff for the LaTeX preamble.
|
||||
'preamble': latex_custom_template,
|
||||
}
|
||||
|
||||
# Grouping the document tree into LaTeX files. List of tuples
|
||||
# (source start file, target name, title,
|
||||
# author, documentclass [howto, manual, or own class]).
|
||||
latex_documents = [
|
||||
('index', 'ArchGuide.tex', u'Architecture Design Guide',
|
||||
u'OpenStack contributors', 'manual'),
|
||||
]
|
||||
|
||||
# The name of an image file (relative to this directory) to place at the top of
|
||||
# the title page.
|
||||
# latex_logo = None
|
||||
|
||||
# For "manual" documents, if this is true, then toplevel headings are parts,
|
||||
# not chapters.
|
||||
# latex_use_parts = False
|
||||
|
||||
# If true, show page references after internal links.
|
||||
# latex_show_pagerefs = False
|
||||
|
||||
# If true, show URL addresses after external links.
|
||||
# latex_show_urls = False
|
||||
|
||||
# Documents to append as an appendix to all manuals.
|
||||
# latex_appendices = []
|
||||
|
||||
# If false, no module index is generated.
|
||||
# latex_domain_indices = True
|
||||
|
||||
|
||||
# -- Options for manual page output ---------------------------------------
|
||||
|
||||
# One entry per manual page. List of tuples
|
||||
# (source start file, name, description, authors, manual section).
|
||||
man_pages = [
|
||||
('index', 'ArchDesign', u'Architecture Design Guide',
|
||||
[u'OpenStack contributors'], 1)
|
||||
]
|
||||
|
||||
# If true, show URL addresses after external links.
|
||||
# man_show_urls = False
|
||||
|
||||
|
||||
# -- Options for Texinfo output -------------------------------------------
|
||||
|
||||
# Grouping the document tree into Texinfo files. List of tuples
|
||||
# (source start file, target name, title, author,
|
||||
# dir menu entry, description, category)
|
||||
texinfo_documents = [
|
||||
('index', 'ArchDesign', u'Architecture Design Guide',
|
||||
u'OpenStack contributors', 'ArchDesign',
|
||||
'To reap the benefits of OpenStack, you should plan, design, '
'and architect your cloud properly, taking user needs into '
'account and understanding the use cases.', 'Miscellaneous'),
|
||||
]
|
||||
|
||||
# Documents to append as an appendix to all manuals.
|
||||
# texinfo_appendices = []
|
||||
|
||||
# If false, no module index is generated.
|
||||
# texinfo_domain_indices = True
|
||||
|
||||
# How to display URL addresses: 'footnote', 'no', or 'inline'.
|
||||
# texinfo_show_urls = 'footnote'
|
||||
|
||||
# If true, do not generate a @detailmenu in the "Top" node's menu.
|
||||
# texinfo_no_detailmenu = False
|
||||
|
||||
# -- Options for Internationalization output ------------------------------
|
||||
locale_dirs = ['locale/']
|
||||
|
||||
# -- Options for PDF output --------------------------------------------------
|
||||
|
||||
pdf_documents = [
|
||||
('index', u'ArchDesignGuide', u'Architecture Design Guide',
|
||||
u'OpenStack contributors')
|
||||
]
|
@ -1,49 +0,0 @@
|
||||
=============================
|
||||
Cloud management architecture
|
||||
=============================
|
||||
|
||||
Complex clouds, in particular hybrid clouds, may require tools to
|
||||
facilitate working across multiple clouds.
|
||||
|
||||
Broker between clouds
|
||||
Brokering software evaluates relative costs between different
|
||||
cloud platforms. Cloud Management Platforms (CMP)
|
||||
allow the designer to determine the right location for the
|
||||
workload based on predetermined criteria.
|
||||
|
||||
Facilitate orchestration across the clouds
|
||||
CMPs simplify the migration of application workloads between
|
||||
public, private, and hybrid cloud platforms.
|
||||
|
||||
We recommend using cloud orchestration tools for managing a diverse
|
||||
portfolio of systems and applications across multiple cloud platforms.
|
||||
|
||||
Technical details
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. TODO
|
||||
|
||||
Capacity and scale
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. TODO
|
||||
|
||||
High availability
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. TODO
|
||||
|
||||
Operator requirements
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. TODO
|
||||
|
||||
Deployment considerations
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. TODO
|
||||
|
||||
Maintenance considerations
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. TODO
|
@ -1,20 +0,0 @@
|
||||
====================
|
||||
Compute architecture
|
||||
====================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
|
||||
design-compute/design-compute-arch
|
||||
design-compute/design-compute-cpu
|
||||
design-compute/design-compute-hypervisor
|
||||
design-compute/design-compute-hardware
|
||||
design-compute/design-compute-overcommit
|
||||
design-compute/design-compute-storage
|
||||
design-compute/design-compute-networking
|
||||
design-compute/design-compute-logging
|
||||
|
||||
This section describes some of the choices you need to consider
|
||||
when designing and building your compute nodes. Compute nodes form the
|
||||
resource core of the OpenStack Compute cloud, providing the processing, memory,
|
||||
network and storage resources to run instances.
|
@ -1,104 +0,0 @@
|
||||
====================================
|
||||
Compute server architecture overview
|
||||
====================================
|
||||
|
||||
When designing compute resource pools, consider the number of processors,
|
||||
amount of memory, network requirements, the quantity of storage required for
|
||||
each hypervisor, and any requirements for bare metal hosts provisioned
|
||||
through ironic.
|
||||
|
||||
When architecting an OpenStack cloud, as part of the planning process, you
|
||||
must not only determine what hardware to utilize but whether compute
|
||||
resources will be provided in a single pool or in multiple pools or
|
||||
availability zones. You should consider if the cloud will provide distinctly
|
||||
different profiles for compute.
|
||||
|
||||
For example, CPU-, memory-, or local storage-optimized compute nodes. For NFV
|
||||
or HPC based clouds, there may even be specific network configurations that
|
||||
should be reserved for those specific workloads on specific compute nodes. This
|
||||
method of designing specific resources into groups or zones of compute can be
|
||||
referred to as bin packing.
|
||||
|
||||
.. note::
|
||||
|
||||
In a bin packing design, each independent resource pool provides service for
|
||||
specific flavors. Since instances are scheduled onto compute hypervisors,
|
||||
each independent node's resources will be allocated to efficiently use the
|
||||
available hardware. While bin packing can separate workload specific
|
||||
resources onto individual servers, bin packing also requires a common
|
||||
hardware design, with all hardware nodes within a compute resource pool
|
||||
sharing a common processor, memory, and storage layout. This makes it easier
|
||||
to deploy, support, and maintain nodes throughout their lifecycle.
|
||||
|
||||
Increasing the size of the supporting compute environment increases the network
|
||||
traffic and messages, adding load to the controllers and administrative
|
||||
services used to support the OpenStack cloud or networking nodes. When
|
||||
considering hardware for controller nodes, whether using the monolithic
|
||||
controller design, where all of the controller services live on one or more
|
||||
physical hardware nodes, or in any of the newer shared nothing control plane
|
||||
models, adequate resources must be allocated and scaled to meet demand.
Effective monitoring of the environment will help with capacity
|
||||
decisions on scaling. Proper planning will help avoid bottlenecks and network
|
||||
oversubscription as the cloud scales.
|
||||
|
||||
Compute nodes automatically attach to OpenStack clouds, resulting in a
|
||||
horizontally scaling process when adding extra compute capacity to an
|
||||
OpenStack cloud. To further group compute nodes and place nodes into
|
||||
appropriate availability zones and host aggregates, additional work is
|
||||
required. It is necessary to plan rack capacity and network switches as scaling
|
||||
out compute hosts directly affects data center infrastructure resources as
|
||||
would any other infrastructure expansion.
|
||||
|
||||
While not as common in large enterprises, compute host components can also be
|
||||
upgraded to account for increases in
|
||||
demand, known as vertical scaling. Upgrading CPUs with more
|
||||
cores, or increasing the overall server memory, can add extra needed
|
||||
capacity depending on whether the running applications are more CPU
|
||||
intensive or memory intensive. We recommend a rolling upgrade of compute
|
||||
nodes for redundancy and availability.
|
||||
After the upgrade, when compute nodes return to the OpenStack cluster, they
are re-scanned and the new resources are discovered and adjusted in the
OpenStack database.
|
||||
|
||||
When selecting a processor, compare features and performance
|
||||
characteristics. Some processors include features specific to
|
||||
virtualized compute hosts, such as hardware-assisted virtualization, and
|
||||
technology related to memory paging (also known as EPT shadowing). These
|
||||
types of features can have a significant impact on the performance of
|
||||
your virtual machine.
|
||||
|
||||
The number of processor cores and threads impacts the number of worker
|
||||
threads which can be run on a resource node. Design decisions must
|
||||
relate directly to the service being run on it, as well as provide a
|
||||
balanced infrastructure for all services.
|
||||
|
||||
Another option is to assess the average workloads and increase the
|
||||
number of instances that can run within the compute environment by
|
||||
adjusting the overcommit ratio. This ratio is configurable for CPU and
|
||||
memory. The default CPU overcommit ratio is 16:1, and the default memory
|
||||
overcommit ratio is 1.5:1. Determining the tuning of the overcommit
|
||||
ratios during the design phase is important as it has a direct impact on
|
||||
the hardware layout of your compute nodes.
|
||||
|
||||
.. note::
|
||||
|
||||
Changing the CPU overcommit ratio can have a detrimental effect
|
||||
and cause a potential increase in a noisy neighbor.
|
||||
|
||||
Insufficient disk capacity could also have a negative effect on overall
|
||||
performance including CPU and memory usage. Depending on the back end
|
||||
architecture of the OpenStack Block Storage layer, adding capacity may involve
|
||||
adding disk shelves to enterprise storage systems or installing
|
||||
additional Block Storage nodes. Upgrading directly attached storage
|
||||
installed in Compute hosts, and adding capacity to the shared storage
|
||||
for additional ephemeral storage to instances, may be necessary.
|
||||
|
||||
Consider the Compute requirements of non-hypervisor nodes (also referred to as
|
||||
resource nodes). This includes controller, Object Storage nodes, Block Storage
|
||||
nodes, and networking services.
|
||||
|
||||
The ability to create pools or availability zones for unpredictable workloads
|
||||
should be considered. In some cases, the demand for certain instance types or
|
||||
flavors may not justify individual hardware design. Allocate hardware designs
|
||||
that are capable of servicing the most common instance requests. Adding
|
||||
hardware to the overall architecture can be done later.
|
@ -1,85 +0,0 @@
|
||||
.. _choosing-a-cpu:
|
||||
|
||||
==============
|
||||
Choosing a CPU
|
||||
==============
|
||||
|
||||
The type of CPU in your compute node is a very important decision. You must
|
||||
ensure that the CPU supports virtualization by way of *VT-x* for Intel chips
|
||||
and *AMD-v* for AMD chips.
|
||||
|
||||
.. tip::
|
||||
|
||||
Consult the vendor documentation to check for virtualization support. For
|
||||
Intel CPUs, see
|
||||
`Does my processor support Intel® Virtualization Technology?
|
||||
<https://www.intel.com/content/www/us/en/support/processors/000005486.html>`_. For AMD CPUs,
|
||||
see `AMD Virtualization
|
||||
<https://www.amd.com/en-us/innovations/software-technologies/server-solution/virtualization>`_.
|
||||
Your CPU may support virtualization but it may be disabled. Consult your
|
||||
BIOS documentation for how to enable CPU features.
|
||||
|
||||
The number of cores that the CPU has also affects your decision. It is
|
||||
common for current CPUs to have up to 24 cores. Additionally, if an Intel CPU
|
||||
supports hyper-threading, those 24 cores are doubled to 48 cores. If you
|
||||
purchase a server that supports multiple CPUs, the number of cores is further
|
||||
multiplied.
|
||||
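As a rough illustration, the arithmetic looks like the following sketch. The
socket, core, and thread counts are assumptions for the example, not
recommendations:

.. code-block:: python

   # Illustrative logical core count for a candidate compute node.
   sockets = 2              # assumed dual-socket server
   cores_per_socket = 24    # assumed physical cores per CPU
   threads_per_core = 2     # 2 with hyper-threading enabled, 1 without

   logical_cores = sockets * cores_per_socket * threads_per_core
   print(logical_cores)     # 96 logical cores visible to the hypervisor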
|
||||
As of the Kilo release, key enhancements have been added to the
|
||||
OpenStack code to improve guest performance. These improvements allow the
|
||||
Compute service to take advantage of greater insight into a compute host's
|
||||
physical layout and therefore make smarter decisions regarding workload
|
||||
placement. Administrators can use this functionality to enable smarter planning
|
||||
choices for use cases like NFV (Network Function Virtualization) and HPC (High
|
||||
Performance Computing).
|
||||
|
||||
Considering non-uniform memory access (NUMA) is important when selecting CPU
sizes and types, as there are use cases that use NUMA pinning to reserve host
cores for operating system processes. This reduces the CPU available to
workloads but protects the operating system.
|
||||
|
||||
.. tip::
|
||||
|
||||
When CPU pinning is requested for a guest, it is assumed
|
||||
there is no overcommit (or, an overcommit ratio of 1.0). When dedicated
|
||||
resourcing is not requested for a workload, the normal overcommit ratios
|
||||
are applied.
|
||||
|
||||
Therefore, we recommend that host aggregates are used to separate not
|
||||
only bare metal hosts, but hosts that will provide resources for workloads
|
||||
that require dedicated resources. This said, when workloads are provisioned
|
||||
to NUMA host aggregates, NUMA nodes are chosen at random and vCPUs can float
|
||||
across NUMA nodes on a host. If workloads require SR-IOV or DPDK, they should
|
||||
be assigned to a NUMA node aggregate with hosts that supply the
|
||||
functionality. More importantly, the workload or vCPUs that are executing
|
||||
processes for a workload should be on the same NUMA node due to the limited
|
||||
amount of cross-node memory bandwidth. In all cases, the ``NUMATopologyFilter``
|
||||
must be enabled for ``nova-scheduler``.
|
||||
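As an illustration only, enabling the filter on the controller might look like
the snippet below. The exact option name and section vary by release, and the
other filters listed are simply common defaults, not a recommendation:

.. code-block:: ini

   [filter_scheduler]
   # Add NUMATopologyFilter so NUMA-aware instance requests are honored.
   enabled_filters = RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,NUMATopologyFilter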
|
||||
Additionally, CPU selection may not be one-size-fits-all across enterprises,
|
||||
but more of a list of SKUs that are tuned for the enterprise workloads.
|
||||
|
||||
For more information about NUMA, see `CPU topologies
|
||||
<https://docs.openstack.org/admin-guide/compute-cpu-topologies.html>`_ in
|
||||
the Administrator Guide.
|
||||
|
||||
In order to take advantage of these new enhancements in the Compute service,
|
||||
compute hosts must be using NUMA capable CPUs.
|
||||
|
||||
.. tip::
|
||||
|
||||
**Multithread Considerations**
|
||||
|
||||
Hyper-Threading is Intel's proprietary simultaneous multithreading
|
||||
implementation used to improve parallelization on their CPUs. You might
|
||||
consider enabling Hyper-Threading to improve the performance of
|
||||
multithreaded applications.
|
||||
|
||||
Whether you should enable Hyper-Threading on your CPUs depends upon your use
|
||||
case. For example, disabling Hyper-Threading can be beneficial in intense
|
||||
computing environments. We recommend performance testing with your local
|
||||
workload with both Hyper-Threading on and off to determine what is more
|
||||
appropriate in your case.
|
||||
|
||||
In most cases, hyper-threading CPUs can provide a 1.3x to 2.0x performance
|
||||
benefit over non-hyper-threaded CPUs depending on types of workload.
|
@ -1,165 +0,0 @@
|
||||
========================
|
||||
Choosing server hardware
|
||||
========================
|
||||
|
||||
Consider the following factors when selecting compute server hardware:
|
||||
|
||||
* Server density
|
||||
A measure of how many servers can fit into a given measure of
|
||||
physical space, such as a rack unit [U].
|
||||
|
||||
* Resource capacity
|
||||
The number of CPU cores, how much RAM, or how much storage a given
|
||||
server delivers.
|
||||
|
||||
* Expandability
|
||||
The number of additional resources you can add to a server before it
|
||||
reaches capacity.
|
||||
|
||||
* Cost
|
||||
The relative cost of the hardware weighed against the total amount of
|
||||
capacity available on the hardware based on predetermined requirements.
|
||||
|
||||
Weigh these considerations against each other to determine the best design for
|
||||
the desired purpose. For example, increasing server density means sacrificing
|
||||
resource capacity or expandability. It also can decrease availability and
|
||||
increase the chance of noisy neighbor issues. Increasing resource capacity and
|
||||
expandability can increase cost but decrease server density. Decreasing cost
|
||||
often means decreasing supportability, availability, server density, resource
|
||||
capacity, and expandability.
|
||||
|
||||
Determine the requirements for the cloud prior to constructing the cloud,
|
||||
and plan for hardware lifecycles, and expansion and new features that may
|
||||
require different hardware.
|
||||
|
||||
If the cloud is initially built with near end of life, but cost effective
|
||||
hardware, then the performance and capacity demand of new workloads will drive
|
||||
the purchase of more modern hardware. With individual hardware components
|
||||
changing over time, you may prefer to manage configurations as stock keeping
|
||||
units (SKU)s. This method provides an enterprise with a standard
|
||||
configuration unit of compute (server) that can be placed in any IT service
|
||||
manager or vendor supplied ordering system that can be triggered manually or
|
||||
through advanced operational automations. This simplifies ordering,
|
||||
provisioning, and activating additional compute resources. For example, there
|
||||
are plug-ins for several commercial service management tools that enable
|
||||
integration with hardware APIs. These configure and activate new compute
|
||||
resources from standby hardware based on a standard configurations. Using this
|
||||
methodology, spare hardware can be ordered for a datacenter and provisioned
|
||||
based on capacity data derived from OpenStack Telemetry.
|
||||
|
||||
Compute capacity (CPU cores and RAM capacity) is a secondary consideration for
selecting server hardware. The required server hardware must supply adequate
CPU sockets, additional CPU cores, and adequate RAM. For more information, see
:ref:`choosing-a-cpu`.
|
||||
|
||||
In compute server architecture design, you must also consider network and
|
||||
storage requirements. For more information on network considerations, see
|
||||
:ref:`network-design`.
|
||||
|
||||
Considerations when choosing hardware
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Here are some other factors to consider when selecting hardware for your
|
||||
compute servers.
|
||||
|
||||
Instance density
|
||||
----------------
|
||||
|
||||
More hosts are required to support the anticipated scale
|
||||
if the design architecture uses dual-socket hardware designs.
|
||||
|
||||
For a general purpose OpenStack cloud, sizing is an important consideration.
|
||||
The expected or anticipated number of instances that each hypervisor can
|
||||
host is a common meter used in sizing the deployment. The selected server
|
||||
hardware needs to support the expected or anticipated instance density.
|
||||
|
||||
Host density
|
||||
------------
|
||||
|
||||
Another option to address the higher host count is to use a
|
||||
quad-socket platform. Taking this approach decreases host density
|
||||
which also increases rack count. This configuration affects the
|
||||
number of power connections and also impacts network and cooling
|
||||
requirements.
|
||||
|
||||
Physical data centers have limited physical space, power, and
|
||||
cooling. The number of hosts (or hypervisors) that can be fitted
|
||||
into a given metric (rack, rack unit, or floor tile) is another
|
||||
important method of sizing. Floor weight is an often overlooked
|
||||
consideration.
|
||||
|
||||
The data center floor must be able to support the weight of the proposed number
|
||||
of hosts within a rack or set of racks. These factors need to be applied as
|
||||
part of the host density calculation and server hardware selection.
|
||||
|
||||
Power and cooling density
|
||||
-------------------------
|
||||
|
||||
The power and cooling density requirements might be lower than with
|
||||
blade, sled, or 1U server designs due to lower host density (by
|
||||
using 2U, 3U or even 4U server designs). For data centers with older
|
||||
infrastructure, this might be a desirable feature.
|
||||
|
||||
Data centers have a specified amount of power fed to a given rack or
|
||||
set of racks. Older data centers may have power densities as low as 20A per
|
||||
rack, and current data centers can be designed to support power densities as
|
||||
high as 120A per rack. The selected server hardware must take power density
|
||||
into account.
|
||||
|
||||
Selecting hardware form factor
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Consider the following in selecting server hardware form factor suited for
|
||||
your OpenStack design architecture:
|
||||
|
||||
* Most blade servers can support dual-socket multi-core CPUs. To avoid
this CPU limit, select ``full width`` or ``full height`` blades. Be
aware, however, that this also decreases server density. For example,
high density blade servers such as HP BladeSystem or Dell PowerEdge
M1000e support up to 16 half-height servers in only ten rack units.
Half-height blades are twice as dense as full-height blades, which
yield only eight servers per ten rack units.
|
||||
|
||||
* 1U rack-mounted servers have the ability to offer greater server density
|
||||
than a blade server solution, but are often limited to dual-socket,
|
||||
multi-core CPU configurations. It is possible to place forty 1U servers
|
||||
in a rack, providing space for the top of rack (ToR) switches, compared
|
||||
to 32 full width blade servers.
|
||||
|
||||
To obtain greater than dual-socket support in a 1U rack-mount form
|
||||
factor, customers need to buy their systems from Original Design
|
||||
Manufacturers (ODMs) or second-tier manufacturers.
|
||||
|
||||
.. warning::
|
||||
|
||||
This may cause issues for organizations that have preferred
|
||||
vendor policies or concerns with support and hardware warranties
|
||||
of non-tier 1 vendors.
|
||||
|
||||
* 2U rack-mounted servers provide quad-socket, multi-core CPU support,
|
||||
but with a corresponding decrease in server density (half the density
|
||||
that 1U rack-mounted servers offer).
|
||||
|
||||
* Larger rack-mounted servers, such as 4U servers, often provide even
|
||||
greater CPU capacity, commonly supporting four or even eight CPU
|
||||
sockets. These servers have greater expandability, but such servers
|
||||
have much lower server density and are often more expensive.
|
||||
|
||||
* ``Sled servers`` are rack-mounted servers that support multiple
|
||||
independent servers in a single 2U or 3U enclosure. These deliver
|
||||
higher density as compared to typical 1U or 2U rack-mounted servers.
|
||||
For example, many sled servers offer four independent dual-socket
|
||||
nodes in 2U for a total of eight CPU sockets in 2U.
|
||||
|
||||
Scaling your cloud
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
When designing an OpenStack cloud compute server architecture, you must
|
||||
decide whether you intend to scale up or scale out. Selecting a
|
||||
smaller number of larger hosts, or a larger number of smaller hosts,
|
||||
depends on a combination of factors: cost, power, cooling, physical rack
|
||||
and floor space, support-warranty, and manageability. Typically, the scale-out
model has been popular for OpenStack because it spreads workloads across more
infrastructure, reducing the impact of any single failure domain.
|
||||
However, the downside is the cost of additional servers and the datacenter
|
||||
resources needed to power, network, and cool the servers.
|
@ -1,46 +0,0 @@
|
||||
======================
|
||||
Choosing a hypervisor
|
||||
======================
|
||||
|
||||
A hypervisor provides software to manage virtual machine access to the
|
||||
underlying hardware. The hypervisor creates, manages, and monitors
|
||||
virtual machines. OpenStack Compute (nova) supports many hypervisors to various
|
||||
degrees, including:
|
||||
|
||||
* `Ironic <https://docs.openstack.org/ironic/latest/>`_
|
||||
* `KVM <https://www.linux-kvm.org/page/Main_Page>`_
|
||||
* `LXC <https://linuxcontainers.org/>`_
|
||||
* `QEMU <https://wiki.qemu.org/Main_Page>`_
|
||||
* `VMware ESX/ESXi <https://www.vmware.com/support/vsphere-hypervisor.html>`_
|
||||
* `Xen (using libvirt) <https://www.xenproject.org>`_
|
||||
* `XenServer <https://xenserver.org>`_
|
||||
* `Hyper-V
|
||||
<https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/hyper-v-technology-overview>`_
|
||||
* `PowerVM <https://www.ibm.com/us-en/marketplace/ibm-powervm>`_
|
||||
* `UML <http://user-mode-linux.sourceforge.net>`_
|
||||
* `Virtuozzo <https://www.virtuozzo.com/products/vz7.html>`_
|
||||
* `zVM <https://www.ibm.com/it-infrastructure/z/zvm>`_
|
||||
|
||||
An important factor in your choice of hypervisor is your current organization's
|
||||
hypervisor usage or experience. Also important is the hypervisor's feature
|
||||
parity, documentation, and the level of community experience.
|
||||
|
||||
According to recent OpenStack user surveys, KVM is the most widely adopted
|
||||
hypervisor in the OpenStack community. Besides KVM, there are many deployments
|
||||
that run other hypervisors such as LXC, VMware, Xen, and Hyper-V. However,
|
||||
these hypervisors are either less used, are niche hypervisors, or have limited
|
||||
functionality compared to more commonly used hypervisors.
|
||||
|
||||
.. note::
|
||||
|
||||
It is also possible to run multiple hypervisors in a single
|
||||
deployment using host aggregates or cells. However, an individual
|
||||
compute node can run only a single hypervisor at a time.
|
||||
|
||||
For more information about feature support for
|
||||
hypervisors as well as ironic and Virtuozzo (formerly Parallels), see
|
||||
`Hypervisor Support Matrix
|
||||
<https://docs.openstack.org/nova/latest/user/support-matrix.html>`_
|
||||
and `Hypervisors
|
||||
<https://docs.openstack.org/ocata/config-reference/compute/hypervisors.html>`_
|
||||
in the Configuration Reference.
|
@ -1,105 +0,0 @@
|
||||
======================
|
||||
Compute server logging
|
||||
======================
|
||||
|
||||
The logs on the compute nodes, or any server running nova-compute (for example
|
||||
in a hyperconverged architecture), are the primary points for troubleshooting
|
||||
issues with the hypervisor and compute services. Additionally, operating system
|
||||
logs can also provide useful information.
|
||||
|
||||
As the cloud environment grows, the amount of log data increases exponentially.
|
||||
Enabling debugging on either the OpenStack services or the operating system
|
||||
further compounds the data issues.
|
||||
|
||||
Logging is described in more detail in `Logging and Monitoring
<https://docs.openstack.org/operations-guide/ops-logging-monitoring.html>`_
in the Operations Guide.
|
||||
However, it is an important design consideration to take into account before
|
||||
commencing operations of your cloud.
|
||||
|
||||
OpenStack produces a great deal of useful logging information, but for
|
||||
the information to be useful for operations purposes, you should consider
|
||||
having a central logging server to send logs to, and a log parsing/analysis
|
||||
system such as Elastic Stack [formerly known as ELK].
|
||||
|
||||
Elastic Stack consists mainly of three components: Elasticsearch (log search
|
||||
and analysis), Logstash (log intake, processing and output) and Kibana (log
|
||||
dashboard service).
|
||||
|
||||
.. figure:: ../figures/ELKbasicArch.png
|
||||
:align: center
|
||||
:alt: Elastic Search Basic Architecture
|
||||
|
||||
Due to the amount of logs being sent from servers in the OpenStack environment,
|
||||
an optional in-memory data structure store can be used. Common examples are
|
||||
Redis and Memcached. In newer versions of Elastic Stack, a file buffer called
|
||||
`Filebeat <https://www.elastic.co/products/beats/filebeat>`_ is used for a
|
||||
similar purpose but adds a "backpressure-sensitive" protocol when sending data
|
||||
to Logstash or Elasticsearch.
|
||||
|
||||
Log analysis often requires disparate logs of differing formats. Elastic
|
||||
Stack (namely Logstash) was created to take many different log inputs and
|
||||
transform them into a consistent format that Elasticsearch can catalog and
|
||||
analyze. As seen in the image above, the process of ingestion starts on the
|
||||
servers by Logstash, is forwarded to the Elasticsearch server for storage and
|
||||
searching, and then displayed through Kibana for visual analysis and
|
||||
interaction.
|
||||
|
||||
For instructions on installing Logstash, Elasticsearch and Kibana, see the
|
||||
`Elasticsearch reference
|
||||
<https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html>`_.
|
||||
|
||||
There are some specific configuration parameters that are needed to
|
||||
configure Logstash for OpenStack. For example, in order to get Logstash to
|
||||
collect, parse, and send the correct portions of log files to the Elasticsearch
|
||||
server, you need to format the configuration file properly. There
|
||||
are input, output and filter configurations. Input configurations tell Logstash
|
||||
where to receive data from (log files/forwarders/filebeats/StdIn/Eventlog),
|
||||
output configurations specify where to put the data, and filter configurations
|
||||
define the input contents to forward to the output.
|
||||
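A minimal sketch of that file structure is shown below. The port, grok
pattern, and host name are assumptions for illustration; a real deployment
would match them to its own log formats and infrastructure:

.. code-block:: none

   # /etc/logstash/conf.d/10-openstack.conf (illustrative sketch)
   input {
     beats { port => 5044 }    # receive events from Filebeat shippers
   }
   filter {
     grok {
       # Assumed pattern for the common OpenStack service log prefix.
       match => { "message" => "%{TIMESTAMP_ISO8601:logdate} %{NUMBER:pid} %{LOGLEVEL:loglevel} %{GREEDYDATA:logmessage}" }
     }
   }
   output {
     elasticsearch { hosts => ["elasticsearch.example.com:9200"] }
   }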
|
||||
The Logstash filter performs intermediary processing on each event. Conditional
|
||||
filters are applied based on the characteristics of the input and the event.
|
||||
Some examples of filtering are:
|
||||
|
||||
* grok
|
||||
* date
|
||||
* csv
|
||||
* json
|
||||
|
||||
There are also output filters available that send event data to many different
|
||||
destinations. Some examples are:
|
||||
|
||||
* csv
|
||||
* redis
|
||||
* elasticsearch
|
||||
* file
|
||||
* jira
|
||||
* nagios
|
||||
* pagerduty
|
||||
* stdout
|
||||
|
||||
Additionally there are several codecs that can be used to change the data
|
||||
representation of events such as:
|
||||
|
||||
* collectd
|
||||
* graphite
|
||||
* json
|
||||
* plain
|
||||
* rubydebug
|
||||
|
||||
These input, output and filter configurations are typically stored in
|
||||
:file:`/etc/logstash/conf.d` but may vary by linux distribution. Separate
|
||||
configuration files should be created for different logging systems such as
|
||||
syslog, Apache, and OpenStack.
|
||||
|
||||
General examples and configuration guides can be found on the Elastic `Logstash
|
||||
Configuration page
|
||||
<https://www.elastic.co/guide/en/logstash/current/configuration-file-structure.html>`_.
|
||||
|
||||
OpenStack input, output and filter examples can be found at
|
||||
`sorantis/elkstack
|
||||
<https://github.com/sorantis/elkstack/tree/master/elk/logstash>`_.
|
||||
|
||||
Once a configuration is complete, Kibana can be used as a visualization tool
|
||||
for OpenStack and system logging. This will allow operators to configure custom
|
||||
dashboards for performance, monitoring and security.
|
@ -1,51 +0,0 @@
|
||||
====================
|
||||
Network connectivity
|
||||
====================
|
||||
|
||||
The selected server hardware must have the appropriate number of network
|
||||
connections, as well as the right type of network connections, in order to
|
||||
support the proposed architecture. Ensure that at least two diverse network
connections come into each rack.
|
||||
|
||||
The selection of form factors or architectures affects the selection of server
|
||||
hardware. Ensure that the selected server hardware is configured to support
|
||||
enough storage capacity (or storage expandability) to match the requirements of
|
||||
selected scale-out storage solution. Similarly, the network architecture
|
||||
impacts the server hardware selection and vice versa.
|
||||
|
||||
While each enterprise install is different, the following networks with their
|
||||
proposed bandwidth is highly recommended for a basic production OpenStack
|
||||
install.
|
||||
|
||||
**Install or OOB network** - Typically used by most distributions and
|
||||
provisioning tools as the network for deploying base software to the
|
||||
OpenStack compute nodes. This network should be connected at a minimum of 1Gb
|
||||
and no routing is usually needed.
|
||||
|
||||
**Internal or Management network** - Used as the internal communication network
|
||||
between OpenStack compute and control nodes. Can also be used as a network
|
||||
for iSCSI communication between the compute and iSCSI storage nodes. Again,
|
||||
this should be a minimum of a 1Gb NIC and should be a non-routed network. This
|
||||
interface should be redundant for high availability (HA).
|
||||
|
||||
**Tenant network** - A private network that enables communication between each
|
||||
tenant's instances. If using flat networking and provider networks, this
|
||||
network is optional. This network should also be isolated from all other
|
||||
networks for security compliance. A 1Gb interface should be sufficient and
|
||||
redundant for HA.
|
||||
|
||||
**Storage network** - A private network which could be connected to the Ceph
|
||||
frontend or other shared storage. For HA purposes this should be a redundant
|
||||
configuration with suggested 10Gb NICs. This network isolates the storage for
|
||||
the instances away from other networks. Under load, this storage traffic
|
||||
could overwhelm other networks and cause outages on other OpenStack services.
|
||||
|
||||
**(Optional) External or Public network** - This network is used to communicate
|
||||
externally from the VMs to the public network space. These addresses are
|
||||
typically handled by the neutron agent on the controller nodes and can also
|
||||
be handled by a SDN other than neutron. However, when using neutron DVR with
|
||||
OVS, this network must be present on the compute node since north and south
|
||||
traffic will not be handled by the controller nodes, but by the compute node
|
||||
itself. For more information on DVR with OVS and compute nodes, see
|
||||
`Open vSwitch: High availability using DVR
|
||||
<https://docs.openstack.org/ocata/networking-guide/deploy-ovs-ha-dvr.html>`_
|
@ -1,48 +0,0 @@
|
||||
==========================
|
||||
Overcommitting CPU and RAM
|
||||
==========================
|
||||
|
||||
OpenStack allows you to overcommit CPU and RAM on compute nodes. This
|
||||
allows you to increase the number of instances running on your cloud at the
|
||||
cost of reducing the performance of the instances. The Compute service uses the
|
||||
following ratios by default:
|
||||
|
||||
* CPU allocation ratio: 16:1
|
||||
* RAM allocation ratio: 1.5:1
|
||||
|
||||
The default CPU allocation ratio of 16:1 means that the scheduler
|
||||
allocates up to 16 virtual cores per physical core. For example, if a
|
||||
physical node has 12 cores, the scheduler sees 192 available virtual
|
||||
cores. With typical flavor definitions of 4 virtual cores per instance,
|
||||
this ratio would provide 48 instances on a physical node.
|
||||
|
||||
The formula for the number of virtual instances on a compute node is
|
||||
``(OR*PC)/VC``, where:
|
||||
|
||||
OR
|
||||
CPU overcommit ratio (virtual cores per physical core)
|
||||
|
||||
PC
|
||||
Number of physical cores
|
||||
|
||||
VC
|
||||
Number of virtual cores per instance
|
||||
|
||||
Similarly, the default RAM allocation ratio of 1.5:1 means that the
|
||||
scheduler allocates instances to a physical node as long as the total
|
||||
amount of RAM associated with the instances is less than 1.5 times the
|
||||
amount of RAM available on the physical node.
|
||||
|
||||
For example, if a physical node has 48 GB of RAM, the scheduler
|
||||
allocates instances to that node until the sum of the RAM associated
|
||||
with the instances reaches 72 GB (such as nine instances, in the case
|
||||
where each instance has 8 GB of RAM).
|
||||
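The calculation above can be sketched as follows. The node sizes and flavor
used here are the example values from this section, not recommendations:

.. code-block:: python

   # Illustrative capacity estimate using the default overcommit ratios.
   cpu_ratio = 16        # OR: virtual cores per physical core
   ram_ratio = 1.5       # virtual RAM per unit of physical RAM

   physical_cores = 12   # PC: cores on the example compute node
   node_ram_gb = 48      # RAM on the example compute node

   flavor_vcpus = 4      # VC: virtual cores per instance
   flavor_ram_gb = 8     # RAM per instance

   by_cpu = (cpu_ratio * physical_cores) // flavor_vcpus      # (OR*PC)/VC = 48
   by_ram = int(ram_ratio * node_ram_gb) // flavor_ram_gb     # 72 GB / 8 GB = 9

   # The scheduler runs out of whichever resource is exhausted first.
   print(min(by_cpu, by_ram))   # 9 instances of this flavor fit on the node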
|
||||
.. note::
|
||||
|
||||
Regardless of the overcommit ratio, an instance can not be placed
|
||||
on any physical node with fewer raw (pre-overcommit) resources than
|
||||
the instance flavor requires.
|
||||
|
||||
You must select the appropriate CPU and RAM allocation ratio for your
|
||||
particular use case.
|
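If the defaults do not fit your workloads, the ratios can be adjusted in the
Compute service configuration. The snippet below is a sketch only; the values
shown simply restate the defaults, and the section in which these options live
can differ between releases:

.. code-block:: ini

   [DEFAULT]
   # Virtual CPUs scheduled per physical core.
   cpu_allocation_ratio = 16.0
   # Virtual RAM scheduled per unit of physical RAM.
   ram_allocation_ratio = 1.5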
@ -1,154 +0,0 @@
|
||||
==========================
|
||||
Instance storage solutions
|
||||
==========================
|
||||
|
||||
As part of the architecture design for a compute cluster, you must specify
|
||||
storage for the disk on which the instantiated instance runs. There are three
|
||||
main approaches to providing temporary storage:
|
||||
|
||||
* Off compute node storage—shared file system
|
||||
* On compute node storage—shared file system
|
||||
* On compute node storage—nonshared file system
|
||||
|
||||
In general, the questions you should ask when selecting storage are as
|
||||
follows:
|
||||
|
||||
* What are my workloads?
|
||||
* Do my workloads have IOPS requirements?
|
||||
* Are there read, write, or random access performance requirements?
|
||||
* What is my forecast for the scaling of storage for compute?
|
||||
* What storage is my enterprise currently using? Can it be re-purposed?
|
||||
* How do I manage the storage operationally?
|
||||
|
||||
Many operators use separate compute and storage hosts instead of a
|
||||
hyperconverged solution. Compute services and storage services have different
|
||||
requirements, and compute hosts typically require more CPU and RAM than storage
|
||||
hosts. Therefore, for a fixed budget, it makes sense to have different
|
||||
configurations for your compute nodes and your storage nodes. Compute nodes
|
||||
will be invested in CPU and RAM, and storage nodes will be invested in block
|
||||
storage.
|
||||
|
||||
However, if you are more restricted in the number of physical hosts you have
|
||||
available for creating your cloud and you want to be able to dedicate as many
|
||||
of your hosts as possible to running instances, it makes sense to run compute
|
||||
and storage on the same machines or use an existing storage array that is
|
||||
available.
|
||||
|
||||
The three main approaches to instance storage are provided in the next
|
||||
few sections.
|
||||
|
||||
Non-compute node based shared file system
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In this option, the disks storing the running instances are hosted in
|
||||
servers outside of the compute nodes.
|
||||
|
||||
If you use separate compute and storage hosts, you can treat your
|
||||
compute hosts as "stateless". As long as you do not have any instances
|
||||
currently running on a compute host, you can take it offline or wipe it
|
||||
completely without having any effect on the rest of your cloud. This
|
||||
simplifies maintenance for the compute hosts.
|
||||
|
||||
There are several advantages to this approach:
|
||||
|
||||
* If a compute node fails, instances are usually easily recoverable.
|
||||
* Running a dedicated storage system can be operationally simpler.
|
||||
* You can scale to any number of spindles.
|
||||
* It may be possible to share the external storage for other purposes.
|
||||
|
||||
The main disadvantages to this approach are:
|
||||
|
||||
* Depending on design, heavy I/O usage from some instances can affect
|
||||
unrelated instances.
|
||||
* Use of the network can decrease performance.
|
||||
* Scalability can be affected by network architecture.
|
||||
|
||||
On compute node storage—shared file system
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In this option, each compute node is specified with a significant amount
|
||||
of disk space, but a distributed file system ties the disks from each
|
||||
compute node into a single mount.
|
||||
|
||||
The main advantage of this option is that it scales to external storage
|
||||
when you require additional storage.
|
||||
|
||||
However, this option has several disadvantages:
|
||||
|
||||
* Running a distributed file system can make you lose your data
|
||||
locality compared with nonshared storage.
|
||||
* Recovery of instances is complicated by depending on multiple hosts.
|
||||
* The chassis size of the compute node can limit the number of spindles
|
||||
able to be used in a compute node.
|
||||
* Use of the network can decrease performance.
|
||||
* Loss of compute nodes decreases storage availability for all hosts.
|
||||
|
||||
On compute node storage—nonshared file system
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In this option, each compute node is specified with enough disks to store the
|
||||
instances it hosts.
|
||||
|
||||
There are two main advantages:
|
||||
|
||||
* Heavy I/O usage on one compute node does not affect instances on other
|
||||
compute nodes. Direct I/O access can increase performance.
|
||||
* Each host can have different storage profiles for hosts aggregation and
|
||||
availability zones.
|
||||
|
||||
There are several disadvantages:
|
||||
|
||||
* If a compute node fails, the data associated with the instances running on
|
||||
that node is lost.
|
||||
* The chassis size of the compute node can limit the number of spindles
|
||||
able to be used in a compute node.
|
||||
* Migrations of instances from one node to another are more complicated
|
||||
and rely on features that may not continue to be developed.
|
||||
* If additional storage is required, this option does not scale.
|
||||
|
||||
Running a shared file system on a storage system apart from the compute nodes
|
||||
is ideal for clouds where reliability and scalability are the most important
|
||||
factors. Running a shared file system on the compute nodes themselves may be
|
||||
best in a scenario where you have to deploy to pre-existing servers for which
|
||||
you have little to no control over their specifications or have specific
|
||||
storage performance needs but do not have a need for persistent storage.
|
||||
|
||||
Issues with live migration
|
||||
--------------------------
|
||||
|
||||
Live migration is an integral part of the operations of the
|
||||
cloud. This feature provides the ability to seamlessly move instances
|
||||
from one physical host to another, a necessity for performing upgrades
|
||||
that require reboots of the compute hosts, but only works well with
|
||||
shared storage.
|
||||
|
||||
Live migration can also be done with non-shared storage, using a feature
|
||||
known as *KVM live block migration*. While an earlier implementation of
|
||||
block-based migration in KVM and QEMU was considered unreliable, there
|
||||
is a newer, more reliable implementation of block-based live migration
|
||||
as of the Mitaka release.
|
||||
|
||||
Live migration and block migration still have some issues:
|
||||
|
||||
* Error reporting has received some attention in Mitaka and Newton but there
|
||||
are improvements needed.
|
||||
* Live migration resource tracking issues.
|
||||
* Live migration of rescued images.
|
||||
|
||||
Choice of file system
|
||||
---------------------
|
||||
|
||||
If you want to support shared-storage live migration, you need to
|
||||
configure a distributed file system.
|
||||
|
||||
Possible options include:
|
||||
|
||||
* NFS (default for Linux)
|
||||
* Ceph
|
||||
* GlusterFS
|
||||
* MooseFS
|
||||
* Lustre
|
||||
|
||||
We recommend that you choose the option operators are most familiar with.
|
||||
NFS is the easiest to set up and there is extensive community knowledge
|
||||
about it.
|
@ -1,413 +0,0 @@
|
||||
==========================
|
||||
Control plane architecture
|
||||
==========================
|
||||
|
||||
.. From Ops Guide chapter: Designing for Cloud Controllers and Cloud
|
||||
Management
|
||||
|
||||
OpenStack is designed to be massively horizontally scalable, which
|
||||
allows all services to be distributed widely. However, to simplify this
|
||||
guide, we have decided to discuss services of a more central nature,
|
||||
using the concept of a *cloud controller*. A cloud controller is a
|
||||
conceptual simplification. In the real world, you design an architecture
|
||||
for your cloud controller that enables high availability so that if any
|
||||
node fails, another can take over the required tasks. In reality, cloud
|
||||
controller tasks are spread out across more than a single node.
|
||||
|
||||
The cloud controller provides the central management system for
|
||||
OpenStack deployments. Typically, the cloud controller manages
|
||||
authentication and sends messaging to all the systems through a message
|
||||
queue.
|
||||
|
||||
For many deployments, the cloud controller is a single node. However, to
|
||||
have high availability, you have to take a few considerations into
|
||||
account, which we'll cover in this chapter.
|
||||
|
||||
The cloud controller manages the following services for the cloud:
|
||||
|
||||
Databases
|
||||
Tracks current information about users and instances, for example,
|
||||
in a database, typically one database instance managed per service
|
||||
|
||||
Message queue services
|
||||
All :term:`Advanced Message Queuing Protocol (AMQP)` messages for
|
||||
services are received and sent according to the queue broker
|
||||
|
||||
Conductor services
|
||||
Proxy requests to a database
|
||||
|
||||
Authentication and authorization for identity management
|
||||
Indicates which users can do what actions on certain cloud
resources; quota management is spread out among services, however
authentication is handled centrally by the Identity service
|
||||
|
||||
Image-management services
|
||||
Stores and serves images with metadata on each, for launching in the
|
||||
cloud
|
||||
|
||||
Scheduling services
|
||||
Indicates which resources to use first; for example, spreading out
|
||||
where instances are launched based on an algorithm
|
||||
|
||||
User dashboard
|
||||
Provides a web-based front end for users to consume OpenStack cloud
|
||||
services
|
||||
|
||||
API endpoints
|
||||
Offers each service's REST API access, where the API endpoint
|
||||
catalog is managed by the Identity service
|
||||
|
||||
For our example, the cloud controller has a collection of ``nova-*``
|
||||
components that represent the global state of the cloud; talks to
|
||||
services such as authentication; maintains information about the cloud
|
||||
in a database; communicates to all compute nodes and storage
|
||||
:term:`workers <worker>` through a queue; and provides API access.
|
||||
Each service running on a designated cloud controller may be broken out
|
||||
into separate nodes for scalability or availability.
|
||||
|
||||
As another example, you could use pairs of servers for a collective
|
||||
cloud controller—one active, one standby—for redundant nodes providing a
|
||||
given set of related services, such as:
|
||||
|
||||
- Front end web for API requests, the scheduler for choosing which
|
||||
compute node to boot an instance on, Identity services, and the
|
||||
dashboard
|
||||
|
||||
- Database and message queue server (such as MySQL, RabbitMQ)
|
||||
|
||||
- Image service for the image management
|
||||
|
||||
Now that you see the myriad designs for controlling your cloud, read
|
||||
more about the further considerations to help with your design
|
||||
decisions.
|
||||
|
||||
Hardware Considerations
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
A cloud controller's hardware can be the same as a compute node, though
|
||||
you may want to further specify based on the size and type of cloud that
|
||||
you run.
|
||||
|
||||
It's also possible to use virtual machines for all or some of the
|
||||
services that the cloud controller manages, such as the message queuing.
|
||||
In this guide, we assume that all services are running directly on the
|
||||
cloud controller.
|
||||
|
||||
:ref:`table_controller_hardware` contains common considerations to
|
||||
review when sizing hardware for the cloud controller design.
|
||||
|
||||
.. _table_controller_hardware:
|
||||
|
||||
.. list-table:: Table. Cloud controller hardware sizing considerations
|
||||
:widths: 25 75
|
||||
:header-rows: 1
|
||||
|
||||
* - Consideration
|
||||
- Ramification
|
||||
* - How many instances will run at once?
|
||||
- Size your database server accordingly, and scale out beyond one cloud
controller if many instances will report status at the same time and if
scheduling new instance launches requires significant computing power.
|
||||
* - How many compute nodes will run at once?
|
||||
- Ensure that your messaging queue handles requests successfully and size
|
||||
accordingly.
|
||||
* - How many users will access the API?
|
||||
- If many users will make multiple requests, make sure that the CPU load
|
||||
for the cloud controller can handle it.
|
||||
* - How many users will access the dashboard versus the REST API directly?
|
||||
- The dashboard makes many requests, even more than the API access, so
|
||||
add even more CPU if your dashboard is the main interface for your users.
|
||||
* - How many ``nova-api`` services do you run at once for your cloud?
|
||||
- You need to size the controller with a core per service.
|
||||
* - How long does a single instance run?
|
||||
- Starting instances and deleting instances is demanding on the compute
|
||||
node but also demanding on the controller node because of all the API
|
||||
queries and scheduling needs.
|
||||
* - Does your authentication system also verify externally?
|
||||
- External systems such as :term:`LDAP <Lightweight Directory Access
|
||||
Protocol (LDAP)>` or :term:`Active Directory` require network
|
||||
connectivity between the cloud controller and an external authentication
|
||||
system. Also ensure that the cloud controller has the CPU power to keep
|
||||
up with requests.
|
||||
|
||||
|
||||
Separation of Services
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
While our example contains all central services in a single location, it
|
||||
is possible and indeed often a good idea to separate services onto
|
||||
different physical servers. :ref:`table_deployment_scenarios` is a list
|
||||
of deployment scenarios we've seen and their justifications.
|
||||
|
||||
.. _table_deployment_scenarios:
|
||||
|
||||
.. list-table:: Table. Deployment scenarios
|
||||
:widths: 25 75
|
||||
:header-rows: 1
|
||||
|
||||
* - Scenario
|
||||
- Justification
|
||||
* - Run ``glance-*`` servers on the ``swift-proxy`` server.
|
||||
- This deployment felt that the spare I/O on the Object Storage proxy
|
||||
server was sufficient and that the Image Delivery portion of glance
|
||||
benefited from being on physical hardware and having good connectivity
|
||||
to the Object Storage back end it was using.
|
||||
* - Run a central dedicated database server.
|
||||
- This deployment used a central dedicated server to provide the databases
|
||||
for all services. This approach simplified operations by isolating
|
||||
database server updates and allowed for the simple creation of slave
|
||||
database servers for failover.
|
||||
* - Run one VM per service.
|
||||
- This deployment ran central services on a set of servers running KVM.
|
||||
A dedicated VM was created for each service (``nova-scheduler``,
|
||||
rabbitmq, database, etc). This assisted the deployment with scaling
|
||||
because administrators could tune the resources given to each virtual
|
||||
machine based on the load it received (something that was not well
|
||||
understood during installation).
|
||||
* - Use an external load balancer.
|
||||
- This deployment had an expensive hardware load balancer in its
|
||||
organization. It ran multiple ``nova-api`` and ``swift-proxy``
|
||||
servers on different physical servers and used the load balancer
|
||||
to switch between them.
|
||||
|
||||
One choice that always comes up is whether to virtualize. Some services,
|
||||
such as ``nova-compute``, ``swift-proxy`` and ``swift-object`` servers,
|
||||
should not be virtualized. However, control servers can often be happily
|
||||
virtualized—the performance penalty can usually be offset by simply
|
||||
running more of the service.
|
||||
|
||||
Database
|
||||
~~~~~~~~
|
||||
|
||||
OpenStack Compute uses an SQL database to store and retrieve stateful
|
||||
information. MySQL is the popular database choice in the OpenStack
|
||||
community.
|
||||
|
||||
Loss of the database leads to errors. As a result, we recommend that you
|
||||
cluster your database to make it failure tolerant. Configuring and
|
||||
maintaining a database cluster is done outside OpenStack and is
|
||||
determined by the database software you choose to use in your cloud
|
||||
environment. MySQL/Galera is a popular option for MySQL-based databases.
|
||||
|
||||
Message Queue
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
Most OpenStack services communicate with each other using the *message
|
||||
queue*. For example, Compute communicates to block storage services and
|
||||
networking services through the message queue. Also, you can optionally
|
||||
enable notifications for any service. RabbitMQ, Qpid, and Zeromq are all
|
||||
popular choices for a message-queue service. In general, if the message
|
||||
queue fails or becomes inaccessible, the cluster grinds to a halt and
|
||||
ends up in a read-only state, with information stuck at the point where
|
||||
the last message was sent. Accordingly, we recommend that you cluster
|
||||
the message queue. Be aware that clustered message queues can be a pain
|
||||
point for many OpenStack deployments. While RabbitMQ has native
|
||||
clustering support, there have been reports of issues when running it at
|
||||
a large scale. Other queuing solutions are available, such as Zeromq and
Qpid. Zeromq does not offer stateful queues. Qpid is the messaging
system of choice for Red Hat and its derivatives, but it does not have
native clustering capabilities and requires a supplemental service, such
as Pacemaker or Corosync. For your message queue, you need to determine
|
||||
what level of data loss you are comfortable with and whether to use an
|
||||
OpenStack project's ability to retry multiple MQ hosts in the event of a
|
||||
failure, such as using Compute's ability to do so.
|
||||
|
||||
Conductor Services
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In the previous version of OpenStack, all ``nova-compute`` services
|
||||
required direct access to the database hosted on the cloud controller.
|
||||
This was problematic for two reasons: security and performance. With
|
||||
regard to security, if a compute node is compromised, the attacker
|
||||
inherently has access to the database. With regard to performance,
|
||||
``nova-compute`` calls to the database are single-threaded and blocking.
|
||||
This creates a performance bottleneck because database requests are
|
||||
fulfilled serially rather than in parallel.
|
||||
|
||||
The conductor service resolves both of these issues by acting as a proxy
|
||||
for the ``nova-compute`` service. Now, instead of ``nova-compute``
|
||||
directly accessing the database, it contacts the ``nova-conductor``
|
||||
service, and ``nova-conductor`` accesses the database on
|
||||
``nova-compute``'s behalf. Since ``nova-compute`` no longer has direct
|
||||
access to the database, the security issue is resolved. Additionally,
|
||||
``nova-conductor`` is a nonblocking service, so requests from all
|
||||
compute nodes are fulfilled in parallel.
|
||||
|
||||
.. note::
|
||||
|
||||
If you are using ``nova-network`` and multi-host networking in your
|
||||
cloud environment, ``nova-compute`` still requires direct access to
|
||||
the database.
|
||||
|
||||
The ``nova-conductor`` service is horizontally scalable. To make
|
||||
``nova-conductor`` highly available and fault tolerant, just launch more
|
||||
instances of the ``nova-conductor`` process, either on the same server
|
||||
or across multiple servers.
|
||||
|
||||
Application Programming Interface (API)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
All public access, whether direct, through a command-line client, or
|
||||
through the web-based dashboard, uses the API service. Find the API
|
||||
reference at `Development resources for OpenStack clouds
|
||||
<https://developer.openstack.org/>`_.
|
||||
|
||||
You must choose whether you want to support the Amazon EC2 compatibility
|
||||
APIs, or just the OpenStack APIs. One issue you might encounter when
|
||||
running both APIs is an inconsistent experience when referring to images
|
||||
and instances.
|
||||
|
||||
For example, the EC2 API refers to instances using IDs that contain
|
||||
hexadecimal, whereas the OpenStack API uses names and digits. Similarly,
|
||||
the EC2 API tends to rely on DNS aliases for contacting virtual
|
||||
machines, as opposed to OpenStack, which typically lists IP
|
||||
addresses.
|
||||
|
||||
If OpenStack is not set up correctly, it is easy to create
|
||||
scenarios in which users are unable to contact their instances due to
|
||||
having only an incorrect DNS alias. Despite this, EC2 compatibility can
|
||||
assist users migrating to your cloud.
|
||||
|
||||
As with databases and message queues, having more than one :term:`API server`
|
||||
is a good thing. Traditional HTTP load-balancing techniques can be used to
|
||||
achieve a highly available ``nova-api`` service.
|
||||
|
||||
Extensions
|
||||
~~~~~~~~~~
|
||||
|
||||
The `API
|
||||
Specifications <https://developer.openstack.org/api-guide/quick-start/index.html>`_ define
|
||||
the core actions, capabilities, and media types of the OpenStack API. A
|
||||
client can always depend on the availability of this core API, and
|
||||
implementers are always required to support it in its entirety.
|
||||
Requiring strict adherence to the core API allows clients to rely upon a
|
||||
minimal level of functionality when interacting with multiple
|
||||
implementations of the same API.
|
||||
|
||||
The OpenStack Compute API is extensible. An extension adds capabilities
|
||||
to an API beyond those defined in the core. The introduction of new
|
||||
features, MIME types, actions, states, headers, parameters, and
|
||||
resources can all be accomplished by means of extensions to the core
|
||||
API. This allows the introduction of new features in the API without
|
||||
requiring a version change and allows the introduction of
|
||||
vendor-specific niche functionality.
|
||||
|
||||
Scheduling
|
||||
~~~~~~~~~~
|
||||
|
||||
The scheduling services are responsible for determining the compute or
|
||||
storage node where a virtual machine or block storage volume should be
|
||||
created. The scheduling services receive creation requests for these
|
||||
resources from the message queue and then begin the process of
|
||||
determining the appropriate node where the resource should reside. This
|
||||
process is done by applying a series of user-configurable filters
|
||||
against the available collection of nodes.
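
The filtering step can be pictured as a chain of predicates that prune the
candidate hosts. The following minimal sketch is illustrative only; the filter
names and host fields are simplified stand-ins for the real scheduler filters.

.. code-block:: python

   # Minimal sketch of filter-style scheduling: each filter prunes hosts that
   # cannot satisfy the request. Filter names and host fields are simplified
   # stand-ins, not the real nova-scheduler filters.
   def ram_filter(host, request):
       return host["free_ram_mb"] >= request["ram_mb"]

   def disk_filter(host, request):
       return host["free_disk_gb"] >= request["disk_gb"]

   def filter_hosts(hosts, request, filters=(ram_filter, disk_filter)):
       candidates = list(hosts)
       for check in filters:
           candidates = [h for h in candidates if check(h, request)]
       return candidates

   hosts = [
       {"name": "compute1", "free_ram_mb": 8192, "free_disk_gb": 80},
       {"name": "compute2", "free_ram_mb": 2048, "free_disk_gb": 200},
   ]
   print(filter_hosts(hosts, {"ram_mb": 4096, "disk_gb": 40}))  # only compute1 passes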
|
||||
|
||||
There are currently two schedulers: ``nova-scheduler`` for virtual
|
||||
machines and ``cinder-scheduler`` for block storage volumes. Both
|
||||
schedulers are able to scale horizontally, so for high-availability
|
||||
purposes, or for very large or high-schedule-frequency installations,
|
||||
you should consider running multiple instances of each scheduler. The
|
||||
schedulers all listen to the shared message queue, so no special load
|
||||
balancing is required.
|
||||
|
||||
Images
|
||||
~~~~~~
|
||||
|
||||
The OpenStack Image service consists of two parts: ``glance-api`` and
|
||||
``glance-registry``. The former is responsible for the delivery of
|
||||
images; the compute node uses it to download images from the back end.
|
||||
The latter maintains the metadata information associated with virtual
|
||||
machine images and requires a database.
|
||||
|
||||
The ``glance-api`` part is an abstraction layer that allows a choice of
|
||||
back end. Currently, it supports:
|
||||
|
||||
OpenStack Object Storage
|
||||
Allows you to store images as objects.
|
||||
|
||||
File system
|
||||
Uses any traditional file system to store the images as files.
|
||||
|
||||
S3
|
||||
Allows you to fetch images from Amazon S3.
|
||||
|
||||
HTTP
|
||||
Allows you to fetch images from a web server. You cannot write
|
||||
images by using this mode.
|
||||
|
||||
If you have an OpenStack Object Storage service, we recommend using this
|
||||
as a scalable place to store your images. You can also use a file system
|
||||
with sufficient performance or, if you do not need the ability to upload
|
||||
new images through OpenStack, Amazon S3.
|
||||
|
||||
Dashboard
|
||||
~~~~~~~~~
|
||||
|
||||
The OpenStack dashboard (horizon) provides a web-based user interface to
|
||||
the various OpenStack components. The dashboard includes an end-user
|
||||
area for users to manage their virtual infrastructure and an admin area
|
||||
for cloud operators to manage the OpenStack environment as a
|
||||
whole.
|
||||
|
||||
The dashboard is implemented as a Python web application that normally
|
||||
runs in :term:`Apache` ``httpd``. Therefore, you may treat it the same as any
|
||||
other web application, provided it can reach the API servers (including
|
||||
their admin endpoints) over the network.
|
||||
|
||||
Authentication and Authorization
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The concepts supporting OpenStack's authentication and authorization are
|
||||
derived from well-understood and widely used systems of a similar
|
||||
nature. Users have credentials they can use to authenticate, and they
|
||||
can be a member of one or more groups (known as projects or tenants,
|
||||
interchangeably).
|
||||
|
||||
For example, a cloud administrator might be able to list all instances
|
||||
in the cloud, whereas a user can see only those in their current group.
|
||||
Resource quotas, such as the number of cores that can be used, disk
|
||||
space, and so on, are associated with a project.
|
||||
|
||||
OpenStack Identity provides authentication decisions and user attribute
|
||||
information, which is then used by the other OpenStack services to
|
||||
perform authorization. The policy is set in the ``policy.json`` file.
|
||||
For information on how to configure these, see `Managing Projects and Users
|
||||
<https://docs.openstack.org/operations-guide/ops-projects-users.html>`_ in the
|
||||
OpenStack Operations Guide.
|
||||
|
||||
OpenStack Identity supports different plug-ins for authentication
|
||||
decisions and identity storage. Examples of these plug-ins include:
|
||||
|
||||
- In-memory key-value Store (a simplified internal storage structure)
|
||||
|
||||
- SQL database (such as MySQL or PostgreSQL)
|
||||
|
||||
- Memcached (a distributed memory object caching system)
|
||||
|
||||
- LDAP (such as OpenLDAP or Microsoft's Active Directory)
|
||||
|
||||
Many deployments use the SQL database; however, LDAP is also a popular
|
||||
choice for those with existing authentication infrastructure that needs
|
||||
to be integrated.
|
||||
|
||||
Network Considerations
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Because the cloud controller handles so many different services, it must
|
||||
be able to handle the amount of traffic that hits it. For example, if
|
||||
you choose to host the OpenStack Image service on the cloud controller,
|
||||
the cloud controller should be able to support transferring the
|
||||
images at an acceptable speed.
|
||||
|
||||
As another example, if you choose to use single-host networking where
|
||||
the cloud controller is the network gateway for all instances, then the
|
||||
cloud controller must support the total amount of traffic that travels
|
||||
between your cloud and the public Internet.
|
||||
|
||||
We recommend that you use a fast NIC, such as 10 GbE. You can also choose
|
||||
to use two 10 GbE NICs and bond them together. While you might not be
|
||||
able to get a full bonded 20 Gbps of throughput, different transmission streams
|
||||
use different NICs. For example, if the cloud controller transfers two
|
||||
images, each image uses a different NIC and gets a full 10 Gbps of
|
||||
bandwidth.
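
As a back-of-the-envelope illustration of the per-stream limit, assuming ideal
throughput and no protocol overhead, the numbers below are illustrative only.

.. code-block:: python

   # Back-of-the-envelope transfer times on the cloud controller, assuming
   # ideal throughput and no protocol overhead; the numbers are illustrative.
   image_size_gb = 20          # size of one image in gigabytes
   stream_gbps = 10            # a single stream is limited to one 10 GbE link

   seconds_per_image = image_size_gb * 8 / stream_gbps
   print(f"~{seconds_per_image:.0f} s per image over a single 10 GbE stream")

   # With two bonded 10 GbE NICs, two concurrent transfers can each use a full
   # link, but any single stream still gets at most 10 Gbps.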
|
@ -1,3 +0,0 @@
|
||||
=====================
|
||||
Identity architecture
|
||||
=====================
|
@ -1,3 +0,0 @@
|
||||
==========================
|
||||
Image Service architecture
|
||||
==========================
|
@ -1,31 +0,0 @@
|
||||
.. _network-design:
|
||||
|
||||
====================
|
||||
Network architecture
|
||||
====================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
design-networking/design-networking-concepts
|
||||
design-networking/design-networking-design
|
||||
design-networking/design-networking-services
|
||||
|
||||
OpenStack provides a rich networking environment. This chapter
|
||||
details the requirements and options to consider when designing your
|
||||
cloud. This includes examples of network implementations to
|
||||
consider, information about some OpenStack network layouts and networking
|
||||
services that are essential for stable operation.
|
||||
|
||||
.. warning::
|
||||
|
||||
If this is the first time you are deploying a cloud infrastructure
|
||||
in your organization, your first conversations should be with your
|
||||
networking team. Network usage in a running cloud is vastly different
|
||||
from traditional network deployments and has the potential to be
|
||||
disruptive at both a connectivity and a policy level.
|
||||
|
||||
For example, you must plan the number of IP addresses that you need for
|
||||
both your guest instances as well as management infrastructure.
|
||||
Additionally, you must research and discuss cloud network connectivity
|
||||
through proxy servers and firewalls.
|
@ -1,218 +0,0 @@
|
||||
===================
|
||||
Networking concepts
|
||||
===================
|
||||
|
||||
A cloud environment fundamentally changes the ways that networking is provided
|
||||
and consumed. Understanding the following concepts and decisions is imperative
|
||||
when making architectural decisions. For detailed information on networking
|
||||
concepts, see the `OpenStack Networking Guide
|
||||
<https://docs.openstack.org/ocata/networking-guide/>`_.
|
||||
|
||||
Network zones
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
The cloud networks are divided into a number of logical zones that support the
|
||||
network traffic flow requirements. We recommend defining at least four
|
||||
distinct network zones.
|
||||
|
||||
Underlay
|
||||
--------
|
||||
|
||||
The underlay zone is defined as the physical network switching infrastructure
|
||||
that connects the storage, compute and control platforms. There are a large
|
||||
number of potential underlay options available.
|
||||
|
||||
Overlay
|
||||
-------
|
||||
|
||||
The overlay zone is defined as any L3 connectivity between the cloud components
|
||||
and could take the form of SDN solutions such as the neutron overlay solution
|
||||
or 3rd Party SDN solutions.
|
||||
|
||||
Edge
|
||||
----
|
||||
|
||||
The edge zone is where network traffic transitions from the cloud overlay or
|
||||
SDN networks into the traditional network environments.
|
||||
|
||||
External
|
||||
--------
|
||||
|
||||
The external network is defined as the configuration and components that are
|
||||
required to provide access to cloud resources and workloads; that is, all
|
||||
the components outside of the cloud edge gateways.
|
||||
|
||||
Traffic flow
|
||||
~~~~~~~~~~~~
|
||||
|
||||
There are two primary types of traffic flow within a cloud infrastructure, and the
|
||||
choice of networking technologies is influenced by the expected loads.
|
||||
|
||||
East/West - The internal traffic flow between workloads within the cloud, as well
|
||||
as the traffic flow between the compute nodes and storage nodes, falls into the
|
||||
East/West category. Generally this is the heaviest traffic flow and, because it
|
||||
carries storage access, needs a minimal number of hops and low
|
||||
latency.
|
||||
|
||||
North/South - The flow of traffic between the workload and all external
|
||||
networks, including clients and remote services. This traffic flow is highly
|
||||
dependent on the workload within the cloud and the type of network services
|
||||
being offered.
|
||||
|
||||
Layer networking choices
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
There are several factors to take into consideration when deciding on whether
|
||||
to use Layer 2 networking architecture or a layer 3 networking architecture.
|
||||
For more information about OpenStack networking concepts, see the
|
||||
`OpenStack Networking <https://docs.openstack.org/ocata/networking-guide/intro-os-networking.html#>`_
|
||||
section in the OpenStack Networking Guide.
|
||||
|
||||
Benefits using a Layer-2 network
|
||||
--------------------------------
|
||||
|
||||
There are several reasons a network designed on layer-2 protocols is selected
|
||||
over a network designed on layer-3 protocols. In spite of the difficulties of
|
||||
using a bridge to perform the network role of a router, many vendors,
|
||||
customers, and service providers choose to use Ethernet in as many parts of
|
||||
their networks as possible. The benefits of selecting a layer-2 design are:
|
||||
|
||||
* Ethernet frames contain all the essentials for networking. These include, but
|
||||
are not limited to, globally unique source addresses, globally unique
|
||||
destination addresses, and error control.
|
||||
|
||||
* Ethernet frames can carry any kind of packet. Networking at layer-2 is
|
||||
independent of the layer-3 protocol.
|
||||
|
||||
* Adding more layers to the Ethernet frame only slows the networking process
|
||||
down. This is known as nodal processing delay.
|
||||
|
||||
* You can add adjunct networking features, for example class of service (CoS)
|
||||
or multicasting, to Ethernet as readily as IP networks.
|
||||
|
||||
* VLANs are an easy mechanism for isolating networks.
|
||||
|
||||
Most information starts and ends inside Ethernet frames. Today this applies
|
||||
to data, voice, and video. The concept is that the network will benefit more
|
||||
from the advantages of Ethernet if the transfer of information from a source
|
||||
to a destination is in the form of Ethernet frames.
|
||||
|
||||
Although it is not a substitute for IP networking, networking at layer-2 can
|
||||
be a powerful adjunct to IP networking.
|
||||
|
||||
Layer-2 Ethernet usage has additional benefits over layer-3 IP network usage:
|
||||
|
||||
* Speed
|
||||
* Reduced overhead of the IP hierarchy.
|
||||
* No need to keep track of address configuration as systems move around.
|
||||
|
||||
Whereas the simplicity of layer-2 protocols might work well in a data center
|
||||
with hundreds of physical machines, cloud data centers have the additional
|
||||
burden of needing to keep track of all virtual machine addresses and
|
||||
networks. In these data centers, it is not uncommon for one physical node
|
||||
to support 30-40 instances.
|
||||
|
||||
.. Important::
|
||||
|
||||
Networking at the frame level says nothing about the presence or
|
||||
absence of IP addresses at the packet level. Almost all ports, links, and
|
||||
devices on a network of LAN switches still have IP addresses, as do all the
|
||||
source and destination hosts. There are many reasons for the continued need
|
||||
for IP addressing. The largest one is the need to manage the network. A
|
||||
device or link without an IP address is usually invisible to most
|
||||
management applications. Utilities including remote access for diagnostics,
|
||||
file transfer of configurations and software, and similar applications
|
||||
cannot run without IP addresses as well as MAC addresses.
|
||||
|
||||
Layer-2 architecture limitations
|
||||
--------------------------------
|
||||
|
||||
Layer-2 network architectures have some limitations that become noticeable when
|
||||
used outside of traditional data centers.
|
||||
|
||||
* Number of VLANs is limited to 4096.
|
||||
* The number of MACs stored in switch tables is limited.
|
||||
* You must accommodate the need to maintain a set of layer-4 devices to handle
|
||||
traffic control.
|
||||
* MLAG, often used for switch redundancy, is a proprietary solution that does
|
||||
not scale beyond two devices and forces vendor lock-in.
|
||||
* It can be difficult to troubleshoot a network without IP addresses and ICMP.
|
||||
* Configuring ARP can be complicated on large layer-2 networks.
|
||||
* All network devices need to be aware of all MACs, even instance MACs, so
|
||||
there is constant churn in MAC tables and network state changes as instances
|
||||
start and stop.
|
||||
* Migrating MACs (instance migration) to different physical locations is a
|
||||
potential problem if you do not set ARP table timeouts properly.
|
||||
|
||||
It is important to know that layer-2 has a very limited set of network
|
||||
management tools. It is difficult to control traffic as it does not have
|
||||
mechanisms to manage the network or shape the traffic. Network
|
||||
troubleshooting is also troublesome, in part because network devices have
|
||||
no IP addresses. As a result, there is no reasonable way to check network
|
||||
delay.
|
||||
|
||||
In a layer-2 network all devices are aware of all MACs, even those that belong
|
||||
to instances. The network state information in the backbone changes whenever an
|
||||
instance starts or stops. Because of this, there is far too much churn in the
|
||||
MAC tables on the backbone switches.
|
||||
|
||||
Furthermore, on large layer-2 networks, configuring ARP learning can be
|
||||
complicated. The setting for the MAC address timer on switches is critical
|
||||
and, if set incorrectly, can cause significant performance problems. So when
|
||||
migrating MACs to different physical locations to support instance migration,
|
||||
problems may arise. As an example, the Cisco default MAC address timer is
|
||||
extremely long. As such, the network information maintained in the switches
|
||||
could be out of sync with the new location of the instance.
|
||||
|
||||
Benefits using a Layer-3 network
|
||||
--------------------------------
|
||||
|
||||
In layer-3 networking, routing takes instance MAC and IP addresses out of the
|
||||
network core, reducing state churn. The only time there would be a routing
|
||||
state change is in the case of a Top of Rack (ToR) switch failure or a link
|
||||
failure in the backbone itself. Other advantages of using a layer-3
|
||||
architecture include:
|
||||
|
||||
* Layer-3 networks provide the same level of resiliency and scalability
|
||||
as the Internet.
|
||||
|
||||
* Controlling traffic with routing metrics is straightforward.
|
||||
|
||||
* You can configure layer-3 to use Border Gateway Protocol (BGP) confederation
|
||||
for scalability. This way core routers have state proportional to the number
|
||||
of racks, not to the number of servers or instances.
|
||||
|
||||
* There are a variety of well tested tools, such as Internet Control Message
|
||||
Protocol (ICMP) to monitor and manage traffic.
|
||||
|
||||
* Layer-3 architectures enable the use of :term:`quality of service (QoS)` to
|
||||
manage network performance.
|
||||
|
||||
Layer-3 architecture limitations
|
||||
--------------------------------
|
||||
|
||||
The main limitation of layer-3 networking is that there is no built-in
|
||||
isolation mechanism comparable to the VLANs in layer-2 networks. Furthermore,
|
||||
the hierarchical nature of IP addresses means that an instance is on the same
|
||||
subnet as its physical host, making migration out of the subnet difficult. For
|
||||
these reasons, network virtualization needs to use IP encapsulation and
|
||||
software at the end hosts. This is for isolation and the separation of the
|
||||
addressing in the virtual layer from the addressing in the physical layer.
|
||||
Other potential disadvantages of layer-3 networking include the need to design
|
||||
an IP addressing scheme rather than relying on the switches to keep track of
|
||||
the MAC addresses automatically, and to configure the interior gateway routing
|
||||
protocol in the switches.
|
||||
|
||||
Networking service (neutron)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
OpenStack Networking (neutron) is the component of OpenStack that provides
|
||||
the Networking service API and a reference architecture that implements a
|
||||
Software Defined Network (SDN) solution.
|
||||
|
||||
The Networking service provides full control over creation of virtual network
|
||||
resources to tenants. This is often accomplished in the form of tunneling
|
||||
protocols that establish encapsulated communication paths over existing
|
||||
network infrastructure in order to segment tenant traffic. This method varies
|
||||
depending on the specific implementation, but some of the more common methods
|
||||
include tunneling over GRE, encapsulating with VXLAN, and VLAN tags.
|
@ -1,281 +0,0 @@
|
||||
==============================
|
||||
Designing an OpenStack network
|
||||
==============================
|
||||
|
||||
There are many reasons an OpenStack network has complex requirements. One main
|
||||
factor is that many components interact at different levels of the system
|
||||
stack. Data flows are also complex.
|
||||
|
||||
Data in an OpenStack cloud moves between instances across the network
|
||||
(known as east-west traffic), as well as in and out of the system (known
|
||||
as north-south traffic). Physical server nodes have network requirements that
|
||||
are independent of instance network requirements and must be isolated to
|
||||
account for scalability. We recommend separating the networks for security
|
||||
purposes and tuning performance through traffic shaping.
|
||||
|
||||
You must consider a number of important technical and business requirements
|
||||
when planning and designing an OpenStack network:
|
||||
|
||||
* Avoid hardware or software vendor lock-in. The design should not rely on
|
||||
specific features of a vendor's network router or switch.
|
||||
* Massively scale the ecosystem to support millions of end users.
|
||||
* Support an indeterminate variety of platforms and applications.
|
||||
* Design for cost efficient operations to take advantage of massive scale.
|
||||
* Ensure that there is no single point of failure in the cloud ecosystem.
|
||||
* High availability architecture to meet customer SLA requirements.
|
||||
* Tolerant to rack level failure.
|
||||
* Maximize flexibility to architect future production environments.
|
||||
|
||||
Considering these requirements, we recommend the following:
|
||||
|
||||
* Design a Layer-3 network architecture rather than a layer-2 network
|
||||
architecture.
|
||||
* Design a dense multi-path network core to support multi-directional
|
||||
scaling and flexibility.
|
||||
* Use hierarchical addressing because it is the only viable option to scale
|
||||
a network ecosystem.
|
||||
* Use virtual networking to isolate instance service network traffic from the
|
||||
management and internal network traffic.
|
||||
* Isolate virtual networks using encapsulation technologies.
|
||||
* Use traffic shaping for performance tuning.
|
||||
* Use External Border Gateway Protocol (eBGP) to connect to the Internet
|
||||
up-link.
|
||||
* Use Internal Border Gateway Protocol (iBGP) to flatten the internal traffic
|
||||
on the layer-3 mesh.
|
||||
* Determine the most effective configuration for block storage network.
|
||||
|
||||
Additional network design considerations
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
There are several other considerations when designing a network-focused
|
||||
OpenStack cloud.
|
||||
|
||||
Redundant networking
|
||||
--------------------
|
||||
|
||||
You should conduct a high availability risk analysis to determine whether to
|
||||
use redundant switches such as Top of Rack (ToR) switches. In most cases, it
|
||||
is much more economical to use single switches with a small pool of spare
|
||||
switches to replace failed units than it is to outfit an entire data center
|
||||
with redundant switches. Applications should tolerate rack level outages
|
||||
without affecting normal operations since network and compute resources are
|
||||
easily provisioned and plentiful.
|
||||
|
||||
Research indicates the mean time between failures (MTBF) on switches is
|
||||
between 100,000 and 200,000 hours. This number is dependent on the ambient
|
||||
temperature of the switch in the data center. When properly cooled and
|
||||
maintained, this translates to between 11 and 22 years before failure. Even
|
||||
in the worst case of poor ventilation and high ambient temperatures in the data
|
||||
center, the MTBF is still 2-3 years.
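
The conversion from hours to years is simple arithmetic:

.. code-block:: python

   # Sanity check of the MTBF-to-years conversion quoted above.
   hours_per_year = 24 * 365
   for mtbf_hours in (100_000, 200_000):
       print(f"{mtbf_hours} h is about {mtbf_hours / hours_per_year:.1f} years")
   # 100000 h is about 11.4 years; 200000 h is about 22.8 years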
|
||||
|
||||
.. Link to research findings?
|
||||
|
||||
.. TODO Legacy networking (nova-network)
|
||||
.. TODO OpenStack Networking
|
||||
.. TODO Simple, single agent
|
||||
.. TODO Complex, multiple agents
|
||||
.. TODO Flat or VLAN
|
||||
.. TODO Flat, VLAN, Overlays, L2-L3, SDN
|
||||
.. TODO No plug-in support
|
||||
.. TODO Plug-in support for 3rd parties
|
||||
.. TODO No multi-tier topologies
|
||||
.. TODO Multi-tier topologies
|
||||
.. What about network security? (DC)
|
||||
|
||||
Providing IPv6 support
|
||||
----------------------
|
||||
|
||||
One of the most important networking topics today is the exhaustion of
|
||||
IPv4 addresses. As of late 2015, ICANN announced that the final
|
||||
IPv4 address blocks have been fully assigned. Because of this, IPv6
|
||||
protocol has become the future of network focused applications. IPv6
|
||||
increases the address space significantly, fixes long standing issues
|
||||
in the IPv4 protocol, and will become essential for network focused
|
||||
applications in the future.
|
||||
|
||||
OpenStack Networking, when configured for it, supports IPv6. To enable
|
||||
IPv6, create an IPv6 subnet in Networking and use IPv6 prefixes when
|
||||
creating security groups.
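
A hedged sketch of creating an IPv6 subnet with the openstacksdk Python client
follows; the cloud name, network ID, and prefix are placeholders for your
environment.

.. code-block:: python

   # Hedged sketch using the openstacksdk client; the cloud name, network ID,
   # and prefix are placeholders for your environment.
   import openstack

   conn = openstack.connect(cloud="mycloud")
   subnet = conn.network.create_subnet(
       network_id="NETWORK_ID",
       name="ipv6-subnet",
       ip_version=6,
       cidr="2001:db8:1234::/64",
       ipv6_ra_mode="slaac",
       ipv6_address_mode="slaac",
   )
   print(subnet.id)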
|
||||
|
||||
Supporting asymmetric links
|
||||
---------------------------
|
||||
|
||||
When designing a network architecture, the traffic patterns of an
|
||||
application heavily influence the allocation of total bandwidth and
|
||||
the number of links that you use to send and receive traffic. Applications
|
||||
that provide file storage for customers allocate bandwidth and links to
|
||||
favor incoming traffic; whereas video streaming applications allocate
|
||||
bandwidth and links to favor outgoing traffic.
|
||||
|
||||
Optimizing network performance
|
||||
------------------------------
|
||||
|
||||
It is important to analyze the application's tolerance for latency and
|
||||
jitter when designing an environment to support network focused
|
||||
applications. Certain applications, for example VoIP, are less tolerant
|
||||
of latency and jitter. When latency and jitter are issues, certain
|
||||
applications may require tuning of QoS parameters and network device
|
||||
queues to ensure that they immediately queue for transmitting or guarantee
|
||||
minimum bandwidth. Since OpenStack currently does not support these functions,
|
||||
consider carefully your selected network plug-in.
|
||||
|
||||
The location of a service may also impact the application or consumer
|
||||
experience. If an application serves differing content to different users,
|
||||
it must properly direct connections to those specific locations. Where
|
||||
appropriate, use a multi-site installation for these situations.
|
||||
|
||||
You can implement networking in two separate ways. Legacy networking
|
||||
(nova-network) provides a flat DHCP network with a single broadcast domain.
|
||||
This implementation does not support tenant isolation networks or advanced
|
||||
plug-ins, but it is currently the only way to implement a distributed
|
||||
layer-3 (L3) agent using the multi-host configuration. The Networking service
|
||||
(neutron) is the official networking implementation and provides a pluggable
|
||||
architecture that supports a large variety of network methods. Some of these
|
||||
include a layer-2 only provider network model, external device plug-ins, or
|
||||
even OpenFlow controllers.
|
||||
|
||||
Networking at large scales becomes a set of boundary questions. The
|
||||
determination of how large a layer-2 domain must be is based on the
|
||||
number of nodes within the domain and the amount of broadcast traffic
|
||||
that passes between instances. Breaking layer-2 boundaries may require
|
||||
the implementation of overlay networks and tunnels. This decision is a
|
||||
balancing act between the need for a smaller overhead or a need for a smaller
|
||||
domain.
|
||||
|
||||
When selecting network devices, be aware that making a decision based on the
|
||||
greatest port density often comes with a drawback. Aggregation switches and
|
||||
routers have not all kept pace with ToR switches and may induce
|
||||
bottlenecks on north-south traffic. As a result, it may be possible for
|
||||
massive amounts of downstream network utilization to impact upstream network
|
||||
devices, impacting service to the cloud. Since OpenStack does not currently
|
||||
provide a mechanism for traffic shaping or rate limiting, it is necessary to
|
||||
implement these features at the network hardware level.
|
||||
|
||||
Using tunable networking components
|
||||
-----------------------------------
|
||||
|
||||
Consider configurable networking components related to an OpenStack
|
||||
architecture design, such as MTU and QoS, when designing for network
|
||||
intensive workloads. Some workloads require a larger MTU than normal
|
||||
due to the transfer of large blocks of data. When providing network
|
||||
service for applications such as video streaming or storage replication,
|
||||
we recommend that you configure both OpenStack hardware nodes and the
|
||||
supporting network equipment for jumbo frames where possible. This
|
||||
allows for better use of available bandwidth. Configure jumbo frames across the
|
||||
complete path the packets traverse. If one network component is not capable of
|
||||
handling jumbo frames then the entire path reverts to the default MTU.
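
The effective MTU of a path is simply the minimum MTU of every component the
packets traverse, which is why a single default-MTU hop defeats jumbo frames.
The component names below are illustrative.

.. code-block:: python

   # The usable MTU on a path is the minimum MTU of every component traversed;
   # the component names below are illustrative.
   path_mtus = {
       "compute-nic": 9000,
       "tor-switch": 9000,
       "aggregation": 1500,   # one default-MTU hop in the path
       "storage-nic": 9000,
   }
   print(f"Effective path MTU: {min(path_mtus.values())}")   # 1500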
|
||||
|
||||
:term:`Quality of Service (QoS)` also has a great impact on network intensive
|
||||
workloads as it provides instant service to packets which have a higher
|
||||
priority due to the impact of poor network performance. In applications such as
|
||||
Voice over IP (VoIP), differentiated services code points are a near
|
||||
requirement for proper operation. You can also use QoS in the opposite
|
||||
direction for mixed workloads to prevent low priority but high bandwidth
|
||||
applications, for example backup services, video conferencing, or file sharing,
|
||||
from blocking bandwidth that is needed for the proper operation of other
|
||||
workloads. It is possible to tag file storage traffic as a lower class, such as
|
||||
best effort or scavenger, to allow the higher priority traffic through. In
|
||||
cases where regions within a cloud might be geographically distributed it may
|
||||
also be necessary to plan accordingly to implement WAN optimization to combat
|
||||
latency or packet loss.
|
||||
|
||||
Choosing network hardware
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The network architecture determines which network hardware will be
|
||||
used. Networking software is determined by the selected networking
|
||||
hardware.
|
||||
|
||||
There are more subtle design impacts that need to be considered. The
|
||||
selection of certain networking hardware (and the networking software)
|
||||
affects the management tools that can be used. There are exceptions to
|
||||
this; the rise of *open* networking software that supports a range of
|
||||
networking hardware means there are instances where the relationship
|
||||
between networking hardware and networking software is not as tightly
|
||||
defined.
|
||||
|
||||
Some of the key considerations in the selection of networking hardware
|
||||
include:
|
||||
|
||||
Port count
|
||||
The design will require networking hardware that has the requisite
|
||||
port count.
|
||||
|
||||
Port density
|
||||
The network design will be affected by the physical space that is
|
||||
required to provide the requisite port count. A higher port density
|
||||
is preferred, as it leaves more rack space for compute or storage
|
||||
components. This can also lead into considerations about fault domains
|
||||
and power density. Higher density switches are more expensive; therefore,
|
||||
it is important not to over-design the network.
|
||||
|
||||
Port speed
|
||||
The networking hardware must support the proposed network speed, for
|
||||
example: 1 GbE, 10 GbE, or 40 GbE (or even 100 GbE).
|
||||
|
||||
Redundancy
|
||||
User requirements for high availability and cost considerations
|
||||
influence the level of network hardware redundancy. Network redundancy
|
||||
can be achieved by adding redundant power supplies or paired switches.
|
||||
|
||||
.. note::
|
||||
|
||||
Hardware must support network redundancy.
|
||||
|
||||
Power requirements
|
||||
Ensure that the physical data center provides the necessary power
|
||||
for the selected network hardware.
|
||||
|
||||
.. note::
|
||||
|
||||
This is not an issue for top of rack (ToR) switches. This may be an issue
|
||||
for spine switches in a leaf and spine fabric, or end of row (EoR)
|
||||
switches.
|
||||
|
||||
Protocol support
|
||||
It is possible to gain more performance out of a single storage
|
||||
system by using specialized network technologies such as RDMA, SRP,
|
||||
iSER and SCST. The specifics of using these technologies is beyond
|
||||
the scope of this book.
|
||||
|
||||
There is no single best practice architecture for the networking
|
||||
hardware supporting an OpenStack cloud. Some of the key factors that will
|
||||
have a major influence on selection of networking hardware include:
|
||||
|
||||
Connectivity
|
||||
All nodes within an OpenStack cloud require network connectivity. In
|
||||
some cases, nodes require access to more than one network segment.
|
||||
The design must encompass sufficient network capacity and bandwidth
|
||||
to ensure that all communications within the cloud, both north-south
|
||||
and east-west traffic, have sufficient resources available.
|
||||
|
||||
Scalability
|
||||
The network design should encompass a physical and logical network
|
||||
design that can be easily expanded upon. Network hardware should
|
||||
offer the appropriate types of interfaces and speeds that are
|
||||
required by the hardware nodes.
|
||||
|
||||
Availability
|
||||
To ensure access to nodes within the cloud is not interrupted,
|
||||
we recommend that the network architecture identifies any single
|
||||
points of failure and provides some level of redundancy or fault
|
||||
tolerance. The network infrastructure often involves use of
|
||||
networking protocols such as LACP, VRRP or others to achieve a highly
|
||||
available network connection. It is also important to consider the
|
||||
networking implications on API availability. We recommend a load balancing
|
||||
solution is designed within the network architecture to ensure that the APIs
|
||||
and potentially other services in the cloud are highly available.
|
||||
|
||||
Choosing networking software
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
OpenStack Networking (neutron) provides a wide variety of networking
|
||||
services for instances. There are many additional networking software
|
||||
packages that can be useful when managing OpenStack components. Some
|
||||
examples include:
|
||||
|
||||
- Software to provide load balancing
|
||||
- Network redundancy protocols
|
||||
- Routing daemons.
|
||||
|
||||
.. TODO Provide software examples
|
@ -1,70 +0,0 @@
|
||||
==============================
|
||||
Additional networking services
|
||||
==============================
|
||||
|
||||
OpenStack, like any network application, has a number of standard
|
||||
services to consider, such as NTP and DNS.
|
||||
|
||||
NTP
|
||||
~~~
|
||||
|
||||
Time synchronization is a critical element to ensure continued operation
|
||||
of OpenStack components. Ensuring that all components have the correct
|
||||
time is necessary to avoid errors in instance scheduling, replication of
|
||||
objects in the object store, and matching log timestamps for debugging.
|
||||
|
||||
All servers running OpenStack components should be able to access an
|
||||
appropriate NTP server. You may decide to set up one locally or use the
|
||||
public pools available from the `Network Time Protocol
|
||||
project <http://www.pool.ntp.org/>`_.
|
||||
|
||||
DNS
|
||||
~~~
|
||||
|
||||
Designate is a multi-tenant DNSaaS service for OpenStack. It provides a REST
|
||||
API with integrated keystone authentication. It can be configured to
|
||||
auto-generate records based on nova and neutron actions. Designate supports a
|
||||
variety of DNS servers including Bind9 and PowerDNS.
|
||||
|
||||
The DNS service provides DNS Zone and RecordSet management for OpenStack
|
||||
clouds. The DNS Service includes a REST API, a command-line client, and a
|
||||
horizon Dashboard plugin.
|
||||
|
||||
For more information, see the `Designate project <https://www.openstack.org/software/releases/ocata/components/designate>`_
|
||||
web page.
|
||||
|
||||
.. note::
|
||||
|
||||
The Designate service does not provide DNS service for the OpenStack
|
||||
infrastructure upon install. We recommend working with your service
|
||||
provider when installing OpenStack in order to properly name your
|
||||
servers and other infrastructure hardware.
|
||||
|
||||
DHCP
|
||||
~~~~
|
||||
|
||||
OpenStack neutron deploys various agents when a network is created within
|
||||
OpenStack. One of these agents is a DHCP agent. This DHCP agent uses the Linux
|
||||
binary ``dnsmasq`` as the delivery agent for DHCP. This agent manages the network
|
||||
namespaces that are spawned for each project subnet to act as a DHCP server.
|
||||
The dnsmasq process is capable of allocating IP addresses to all virtual
|
||||
machines running on a network. When a network is created through OpenStack and
|
||||
the DHCP agent is enabled for that network, DHCP services are enabled by
|
||||
default.
|
||||
|
||||
LBaaS
|
||||
~~~~~
|
||||
|
||||
OpenStack neutron has the ability to distribute incoming requests between
|
||||
designated instances. Using neutron networking and OVS, Load
|
||||
Balancing-as-a-Service (LBaaS) can be created. The load balancing of workloads
|
||||
is used to distribute incoming application requests evenly between designated
|
||||
instances. This operation ensures that a workload is shared predictably among
|
||||
defined instances and allows a more effective use of underlying resources.
|
||||
OpenStack LBaaS can distribute load in the following methods:
|
||||
|
||||
* Round robin - Even rotation between multiple defined instances.
|
||||
* Source IP - Requests from specific IPs are consistently directed to the same
|
||||
instance.
|
||||
* Least connections - Sends requests to the instance with the least number of
|
||||
active connections.
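
The three distribution methods above can be sketched as follows. These are
illustrative stand-ins; a real LBaaS driver implements the selection logic in
the load balancer itself.

.. code-block:: python

   # Minimal sketches of the three distribution methods listed above.
   from itertools import cycle
   import hashlib

   members = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

   # Round robin: even rotation between members.
   round_robin = cycle(members)

   # Source IP: the same client address always maps to the same member.
   def source_ip_pick(client_ip):
       digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
       return members[digest % len(members)]

   # Least connections: pick the member with the fewest active connections.
   def least_connections_pick(active):           # e.g. {"10.0.0.11": 4, ...}
       return min(active, key=active.get)

   print(next(round_robin), source_ip_pick("203.0.113.7"),
         least_connections_pick({"10.0.0.11": 4, "10.0.0.12": 1, "10.0.0.13": 9}))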
|
@ -1,13 +0,0 @@
|
||||
====================
|
||||
Storage architecture
|
||||
====================
|
||||
|
||||
Storage is found in many parts of the OpenStack cloud environment. This
|
||||
chapter describes storage type, design considerations and options when
|
||||
selecting persistent storage options for your cloud environment.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
design-storage/design-storage-concepts
|
||||
design-storage/design-storage-arch
|
@ -1,546 +0,0 @@
|
||||
====================
|
||||
Storage architecture
|
||||
====================
|
||||
|
||||
There are many different storage architectures available when designing an
|
||||
OpenStack cloud. The convergence of orchestration and automation within the
|
||||
OpenStack platform enables rapid storage provisioning without the hassle of
|
||||
the traditional manual processes like volume creation and
|
||||
attachment.
|
||||
|
||||
However, before choosing a storage architecture, a few generic questions should
|
||||
be answered:
|
||||
|
||||
* Will the storage architecture scale linearly as the cloud grows and what are
|
||||
its limits?
|
||||
* What is the desired attachment method: NFS, iSCSI, FC, or other?
|
||||
* Is the storage proven with the OpenStack platform?
|
||||
* What is the level of support provided by the vendor within the community?
|
||||
* What OpenStack features and enhancements does the cinder driver enable?
|
||||
* Does it include tools to help troubleshoot and resolve performance issues?
|
||||
* Is it interoperable with all of the projects you are planning on using
|
||||
in your cloud?
|
||||
|
||||
Choosing storage back ends
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Users will indicate different needs for their cloud architecture. Some may
|
||||
need fast access to many objects that do not change often, or want to
|
||||
set a time-to-live (TTL) value on a file. Others may access only storage
|
||||
that is mounted with the file system itself, but want it to be
|
||||
replicated instantly when starting a new instance. For other systems,
|
||||
ephemeral storage is the preferred choice. When you select
|
||||
:term:`storage back ends <storage back end>`,
|
||||
consider the following questions from the user's perspective:
|
||||
|
||||
First and foremost:
|
||||
|
||||
* Do I need block storage?
|
||||
* Do I need object storage?
|
||||
* Do I need file-based storage?
|
||||
|
||||
Next answer the following:
|
||||
|
||||
* Do I need to support live migration?
|
||||
* Should my persistent storage drives be contained in my compute nodes,
|
||||
or should I use external storage?
|
||||
* What type of performance do I need in regards to IOPS? Total IOPS and IOPS
|
||||
per instance? Do I have applications with IOPS SLAs?
|
||||
* Are my storage needs mostly read, or write, or mixed?
|
||||
* Which storage choices result in the best cost-performance scenario I am
|
||||
aiming for?
|
||||
* How do I manage the storage operationally?
|
||||
* How redundant and distributed is the storage? What happens if a
|
||||
storage node fails? To what extent can it mitigate my data-loss disaster
|
||||
scenarios?
|
||||
* What is my company currently using and can I use it with OpenStack?
|
||||
* Do I need more than one storage choice? Do I need tiered performance storage?
|
||||
|
||||
While this is not a definitive list of every possible question, it should
|
||||
help you narrow down the range of possible storage choices.
|
||||
|
||||
A wide variety of use case requirements dictate the nature of the storage
|
||||
back end. Examples of such requirements are as follows:
|
||||
|
||||
* Public, private, or a hybrid cloud (performance profiles, shared storage,
|
||||
replication options)
|
||||
* Storage-intensive use cases like HPC and Big Data clouds
|
||||
* Web-scale or development clouds where storage is typically ephemeral in
|
||||
nature
|
||||
|
||||
Data security recommendations:
|
||||
|
||||
* We recommend that data be encrypted both in transit and at-rest.
|
||||
To this end, carefully select disks, appliances, and software.
|
||||
Do not assume these features are included with all storage solutions.
|
||||
* Determine the security policy of your organization and understand
|
||||
the data sovereignty of your cloud geography and plan accordingly.
|
||||
|
||||
If you plan to use live migration, we highly recommend a shared storage
|
||||
configuration. This allows the operating system and application volumes
|
||||
for instances to reside outside of the compute nodes and adds significant
|
||||
performance increases when live migrating.
|
||||
|
||||
To deploy your storage by using only commodity hardware, you can use a number
|
||||
of open-source packages, as described in :ref:`table_persistent_file_storage`.
|
||||
|
||||
.. _table_persistent_file_storage:
|
||||
|
||||
.. list-table:: Persistent file-based storage support
|
||||
:widths: 25 25 25 25
|
||||
:header-rows: 1
|
||||
|
||||
* -
|
||||
- Object
|
||||
- Block
|
||||
- File-level
|
||||
* - Swift
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
-
|
||||
* - LVM
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
* - Ceph
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- Experimental
|
||||
* - Gluster
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
* - NFS
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
* - ZFS
|
||||
-
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
* - Sheepdog
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
- .. image:: /figures/Check_mark_23x20_02.png
|
||||
:width: 30%
|
||||
-
|
||||
|
||||
This list of open source file-level shared storage solutions is not
|
||||
exhaustive. Your organization may already have deployed a file-level shared
|
||||
storage solution that you can use.
|
||||
|
||||
.. note::
|
||||
|
||||
**Storage driver support**
|
||||
|
||||
In addition to the open source technologies, there are a number of
|
||||
proprietary solutions that are officially supported by OpenStack Block
|
||||
Storage. You can find a matrix of the functionality provided by all of the
|
||||
supported Block Storage drivers on the `CinderSupportMatrix
|
||||
wiki <https://wiki.openstack.org/wiki/CinderSupportMatrix>`_.
|
||||
|
||||
Also, you need to decide whether you want to support object storage in
|
||||
your cloud. The common use cases for providing object storage in a
|
||||
compute cloud are to provide:
|
||||
|
||||
* Users with a persistent storage mechanism for objects like images and video.
|
||||
* A scalable, reliable data store for OpenStack virtual machine images.
|
||||
* An API driven S3 compatible object store for application use.
|
||||
|
||||
Selecting storage hardware
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Storage hardware architecture is determined by the selected storage
|
||||
architecture. Select the storage architecture by
|
||||
evaluating possible solutions against the critical factors: user
|
||||
requirements, technical considerations, and operational considerations.
|
||||
Consider the following factors when selecting storage hardware:
|
||||
|
||||
Cost
|
||||
Storage can be a significant portion of the overall system cost. For
|
||||
an organization that is concerned with vendor support, a commercial
|
||||
storage solution is advisable, although it comes with a higher price
|
||||
tag. If initial capital expenditure requires minimization, designing
|
||||
a system based on commodity hardware would apply. The trade-off is
|
||||
potentially higher support costs and a greater risk of
|
||||
incompatibility and interoperability issues.
|
||||
|
||||
Performance
|
||||
Performance of block based storage is typically measured in the maximum read
|
||||
and write operations to non-contiguous storage locations per second. This
|
||||
measurement typically applies to SAN, hard drives, and solid state drives.
|
||||
While IOPS can be broadly measured and is not an official benchmark, it is
|
||||
one of the figures vendors commonly use to communicate performance levels. Since
|
||||
there are no real standards for measuring IOPS, vendor test results may vary,
|
||||
sometimes wildly. However, along with transfer rate which measures the speed
|
||||
that data can be transferred to contiguous storage locations, IOPS can be
|
||||
used in a performance evaluation. Typically, transfer rate is represented by
|
||||
a bytes per second calculation but IOPS is measured by an integer.
|
||||
|
||||
To calculate IOPS for a single drive you could use:
|
||||
IOPS = 1 / (AverageLatency + AverageSeekTime)
|
||||
For example:
|
||||
Average Latency for Single Disk = 2.99ms or .00299 seconds
|
||||
Average Seek Time for Single Disk = 4.7ms or .0047 seconds
|
||||
IOPS = 1/(.00299 + .0047)
|
||||
IOPS = 130
|
||||
|
||||
To calculate maximum IOPS for a disk array:
|
||||
Maximum Read IOPS:
|
||||
In order to accurately calculate maximum read IOPS for a disk array,
|
||||
multiply the number of disks in the array by the maximum IOPS per disk.
|
||||
maxReadIOPS = nDisks * diskMaxIOPS
|
||||
For example, 15 10K Spinning Disks would be measured the following way:
|
||||
maxReadIOPS = 15 * 130
maxReadIOPS = 1950
|
||||
|
||||
Maximum write IOPS per array:
|
||||
Determining the maximum *write* IOPS is a little different because most
|
||||
administrators configure disk replication using RAID and since the RAID
|
||||
controller requires IOPS itself, there is a write penalty. The severity of
|
||||
the write penalty is determined by the type of RAID used.
|
||||
|
||||
=========== ==========
|
||||
RAID type   Penalty
|
||||
=========== ==========
|
||||
1           2
|
||||
5           4
|
||||
10          2
|
||||
=========== ==========
|
||||
|
||||
.. note::
|
||||
|
||||
RAID 5 has the worst penalty (it has the most cross-disk writes).
|
||||
Therefore, when using the above examples, a 15 disk array using RAID 5 is
|
||||
capable of 1950 read IOPS; however, we need to apply the penalty when
|
||||
determining the *write* IOPS:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
maxWriteIOPS = 1950 / 4
|
||||
maxWriteIOPS = 487.5
|
||||
|
||||
A RAID 5 array only has 25% of the write IOPS of the read IOPS while a RAID
|
||||
1 array in this case would produce a maximum of 975 IOPS.
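
The same worked example, expressed as a short calculation, assuming 15
spinning disks at roughly 130 IOPS each and a RAID 5 write penalty of 4:

.. code-block:: python

   # Worked IOPS example for a single disk, a 15-disk array, and RAID 5 writes.
   average_latency_s = 0.00299
   average_seek_s = 0.0047
   disk_iops = 1 / (average_latency_s + average_seek_s)        # ~130

   n_disks = 15
   max_read_iops = n_disks * disk_iops                         # ~1950

   raid_write_penalty = {1: 2, 5: 4, 10: 2}
   max_write_iops = max_read_iops / raid_write_penalty[5]      # ~487.5

   print(f"per-disk: {disk_iops:.0f}, array read: {max_read_iops:.0f}, "
         f"RAID 5 write: {max_write_iops:.1f}")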
|
||||
|
||||
What about SSD? DRAM SSD?
|
||||
In an HDD, data transfer is sequential. The actual read/write head "seeks" a
|
||||
point in the hard drive to execute the operation. Seek time is significant.
|
||||
Transfer rate can also be influenced by file system fragmentation and the
|
||||
layout. Finally, the mechanical nature of hard disks also has certain
|
||||
performance limitations.
|
||||
|
||||
In an SSD, data transfer is *not* sequential; it is random so it is faster.
|
||||
There is consistent read performance because the physical location of data is
|
||||
irrelevant; SSDs have no read/write heads and thus no delays due to
|
||||
head motion (seeking).
|
||||
|
||||
.. note::
|
||||
|
||||
Some basic benchmarks for small read/writes:
|
||||
|
||||
- **HDDs**: Small reads – 175 IOPs, Small writes – 280 IOPs
|
||||
- **Flash SSDs**: Small reads – 1075 IOPs (6x), Small writes – 21 IOPs (0.1x)
|
||||
- **DRAM SSDs**: Small reads – 4091 IOPs (23x), Small writes – 4184 IOPs
|
||||
(14x)
|
||||
|
||||
Scalability
|
||||
Scalability, along with expandability, is a major consideration in
|
||||
a general purpose OpenStack cloud. It might be difficult to predict the final
|
||||
intended size of the implementation as there are no established usage patterns
|
||||
for a general purpose cloud. It might become necessary to expand the initial
|
||||
deployment in order to accommodate growth and user demand. Many vendors have
|
||||
implemented their own solutions to this problem. Some use clustered file
|
||||
systems that span multiple appliances, while others have similar technologies
|
||||
to allow block storage to scale past a fixed capacity. Ceph, a distributed
|
||||
storage solution that offers block storage, was designed to solve this scale
|
||||
issue and does not have the same limitations on domains, clusters, or scale
|
||||
issues of other appliance driven models.
|
||||
|
||||
Expandability
|
||||
Expandability is a major architecture factor for storage solutions
|
||||
with general purpose OpenStack cloud. A storage solution that
|
||||
expands to 50 PB is considered more expandable than a solution that
|
||||
only scales to 10 PB. This metric is related to scalability, which is
|
||||
the measure of a solution's performance as it expands.
|
||||
|
||||
Implementing Block Storage
|
||||
--------------------------
|
||||
|
||||
Configure Block Storage resource nodes with advanced RAID controllers
|
||||
and high-performance disks to provide fault tolerance at the hardware
|
||||
level.
|
||||
|
||||
We recommend deploying high performing storage solutions such as SSD
|
||||
drives or flash storage systems for applications requiring additional
|
||||
performance out of Block Storage devices.
|
||||
|
||||
In environments that place substantial demands on Block Storage, we
|
||||
recommend using multiple storage pools. In this case, each pool of
|
||||
devices should have a similar hardware design and disk configuration
|
||||
across all hardware nodes in that pool. This allows for a design that
|
||||
provides applications with access to a wide variety of Block Storage pools,
|
||||
each with their own redundancy, availability, and performance
|
||||
characteristics. When deploying multiple pools of storage, it is also
|
||||
important to consider the impact on the Block Storage scheduler which is
|
||||
responsible for provisioning storage across resource nodes. Ideally,
|
||||
ensure that applications can schedule volumes in multiple regions, each with
|
||||
their own network, power, and cooling infrastructure. This will give tenants
|
||||
the option of building fault-tolerant applications that are distributed
|
||||
across multiple availability zones.
|
||||
|
||||
In addition to the Block Storage resource nodes, it is important to
|
||||
design for high availability and redundancy of the APIs, and related
|
||||
services that are responsible for provisioning and providing access to
|
||||
storage. We recommend designing a layer of hardware or software load
|
||||
balancers in order to achieve high availability of the appropriate REST
|
||||
API services to provide uninterrupted service. In some cases, it may
|
||||
also be necessary to deploy an additional layer of load balancing to
|
||||
provide access to back-end database services responsible for servicing
|
||||
and storing the state of Block Storage volumes. It is imperative that a
|
||||
highly available database cluster is used to store the Block Storage metadata.
|
||||
|
||||
In a cloud with significant demands on Block Storage, the network
|
||||
architecture should take into account the amount of East-West bandwidth
|
||||
required for instances to make use of the available storage resources.
|
||||
The selected network devices should support jumbo frames for
|
||||
transferring large blocks of data, and utilize a dedicated network for
|
||||
providing connectivity between instances and Block Storage.
|
||||
|
||||
Implementing Object Storage
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
While consistency and partition tolerance are both inherent features of
|
||||
the Object Storage service, it is important to design the overall
|
||||
storage architecture to ensure that the implemented system meets those goals.
|
||||
The OpenStack Object Storage service places a specific number of
|
||||
data replicas as objects on resource nodes. Replicas are distributed
|
||||
throughout the cluster, based on a consistent hash ring also stored on
|
||||
each node in the cluster.
|
||||
|
||||
When designing your cluster, you must consider durability and
|
||||
availability, which are dependent on the spread and placement of your data,
|
||||
rather than the reliability of the hardware.
|
||||
|
||||
Consider the default value of the number of replicas, which is three. This
means that before an object is marked as having been written, at least two
copies exist; in case a single server fails to write, the third copy may or
may not yet exist when the write operation initially returns. Altering this
number increases the robustness of your data, but reduces the amount of
storage you have available. Look at the placement of your servers. Consider
spreading them widely throughout your data center's network and power-failure
zones. Is a zone a rack, a server, or a disk?
|
||||
|
||||
Consider these main traffic flows for an Object Storage network:
|
||||
|
||||
* Among :term:`object`, :term:`container`, and
|
||||
:term:`account servers <account server>`
|
||||
* Between servers and the proxies
|
||||
* Between the proxies and your users
|
||||
|
||||
Object Storage frequently communicates among servers hosting data. Even a
small cluster generates megabytes per second of traffic.
|
||||
|
||||
Consider the scenario where an entire server fails and 24 TB of data
|
||||
needs to be transferred "immediately" to remain at three copies — this can
|
||||
put significant load on the network.
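
A rough, back-of-the-envelope sketch of that recovery traffic; the 24 TB
figure comes from the scenario above, while the available replication
bandwidth is an assumption for illustration:

.. code-block:: python

   # Rough estimate of how long re-replicating a failed node takes.
   failed_data_tb = 24
   replication_bandwidth_gbps = 10      # assumed spare back-end bandwidth

   data_bits = failed_data_tb * 1e12 * 8
   seconds = data_bits / (replication_bandwidth_gbps * 1e9)
   print(f"~{seconds / 3600:.1f} hours at {replication_bandwidth_gbps} Gbps")
   # ~5.3 hours, assuming the full link is available for replication only.
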
|
||||
|
||||
Another consideration is that when a new file is being uploaded, the proxy
server must write out as many streams as there are replicas, multiplying
network traffic. For a three-replica cluster, 10 Gbps in means 30 Gbps out.
Combining this replication bandwidth demand with the earlier points about
private versus public networks results in the recommendation that your
private network be of significantly higher bandwidth than your public
network requires. OpenStack Object Storage communicates internally with
unencrypted, unauthenticated rsync for performance, so a dedicated private
network is required.
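
The write amplification is simple to estimate; a minimal sketch, assuming
the replica count and ingest rate quoted above:

.. code-block:: python

   # Back-end traffic generated by client uploads in a replicated cluster.
   replicas = 3
   client_ingest_gbps = 10

   backend_write_gbps = client_ingest_gbps * replicas
   print(f"{client_ingest_gbps} Gbps in -> {backend_write_gbps} Gbps "
         f"written on the private storage network")
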
|
||||
|
||||
The remaining point on bandwidth is the public-facing portion. The
|
||||
``swift-proxy`` service is stateless, which means that you can easily
|
||||
add more and use HTTP load-balancing methods to share bandwidth and
|
||||
availability between them. More proxies means more bandwidth.
|
||||
|
||||
You should consider designing the Object Storage system with a sufficient
|
||||
number of zones to provide quorum for the number of replicas defined. For
|
||||
example, with three replicas configured in the swift cluster, the recommended
|
||||
number of zones to configure within the Object Storage cluster in order to
|
||||
achieve quorum is five. While it is possible to deploy a solution with
|
||||
fewer zones, the implied risk of doing so is that some data may not be
|
||||
available and API requests to certain objects stored in the cluster
|
||||
might fail. For this reason, ensure you properly account for the number
|
||||
of zones in the Object Storage cluster.
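
A simplified way to reason about the three-replica, five-zone guidance
above; the quorum arithmetic is standard majority voting and the zone
count is the recommendation quoted in this section:

.. code-block:: python

   # Majority quorum for a given replica count.
   def quorum(replicas):
       return replicas // 2 + 1

   replicas = 3
   print(f"replicas={replicas}, quorum={quorum(replicas)}")   # 2 of 3

   # With five independent zones, losing any single zone still leaves
   # enough failure domains to hold and serve a majority of replicas.
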
|
||||
|
||||
Each Object Storage zone should be self-contained within its own
|
||||
availability zone. Each availability zone should have independent access
|
||||
to network, power, and cooling infrastructure to ensure uninterrupted
|
||||
access to data. In addition, a pool of Object Storage proxy servers
|
||||
providing access to data stored on the object nodes should service each
|
||||
availability zone. Object proxies in each region should leverage local
|
||||
read and write affinity so that local storage resources facilitate
|
||||
access to objects wherever possible. We recommend deploying upstream
|
||||
load balancing to ensure that proxy services are distributed across the
|
||||
multiple zones and, in some cases, it may be necessary to make use of
|
||||
third-party solutions to aid with geographical distribution of services.
|
||||
|
||||
A zone within an Object Storage cluster is a logical division. Any of
|
||||
the following may represent a zone:
|
||||
|
||||
* A disk within a single node
|
||||
* One zone per node
|
||||
* Zone per collection of nodes
|
||||
* Multiple racks
|
||||
* Multiple data centers
|
||||
|
||||
Selecting the proper zone design is crucial for allowing the Object
|
||||
Storage cluster to scale while providing an available and redundant
|
||||
storage system. It may be necessary to configure storage policies that
|
||||
have different requirements with regards to replicas, retention, and
|
||||
other factors that could heavily affect the design of storage in a
|
||||
specific zone.
|
||||
|
||||
Planning and scaling storage capacity
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
An important consideration in running a cloud over time is projecting growth
|
||||
and utilization trends in order to plan capital expenditures for the short and
|
||||
long term. Gather utilization meters for compute, network, and storage, along
|
||||
with historical records of these meters. While securing major anchor tenants
|
||||
can lead to rapid jumps in the utilization of resources, the average rate of
|
||||
adoption of cloud services through normal usage also needs to be carefully
|
||||
monitored.
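
A minimal sketch of projecting capacity from such utilization meters; the
sample figures are invented purely for illustration:

.. code-block:: python

   # Linear projection of storage demand from monthly usage samples.
   history_tb = [120, 132, 141, 155, 168, 183]   # used capacity per month

   monthly_growth = (history_tb[-1] - history_tb[0]) / (len(history_tb) - 1)
   months_ahead = 12
   projected = history_tb[-1] + monthly_growth * months_ahead
   print(f"~{monthly_growth:.1f} TB/month growth, "
         f"~{projected:.0f} TB needed in {months_ahead} months")
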
|
||||
|
||||
Scaling Block Storage
|
||||
---------------------
|
||||
|
||||
You can upgrade Block Storage pools to add storage capacity without
|
||||
interrupting the overall Block Storage service. Add nodes to the pool by
|
||||
installing and configuring the appropriate hardware and software and
|
||||
then allowing that node to report in to the proper storage pool through the
|
||||
message bus. Block Storage nodes generally report into the scheduler
|
||||
service advertising their availability. As a result, after the node is
|
||||
online and available, tenants can make use of those storage resources
|
||||
instantly.
|
||||
|
||||
In some cases, the demand on Block Storage may exhaust the available
|
||||
network bandwidth. As a result, design network infrastructure that
|
||||
services Block Storage resources in such a way that you can add capacity
|
||||
and bandwidth easily. This often involves the use of dynamic routing
|
||||
protocols or advanced networking solutions to add capacity to downstream
|
||||
devices easily. Both the front-end and back-end storage network designs
|
||||
should encompass the ability to quickly and easily add capacity and
|
||||
bandwidth.
|
||||
|
||||
.. note::
|
||||
|
||||
Sufficient monitoring and data collection should be in-place
|
||||
from the start, such that timely decisions regarding capacity,
|
||||
input/output metrics (IOPS) or storage-associated bandwidth can
|
||||
be made.
|
||||
|
||||
Scaling Object Storage
|
||||
----------------------
|
||||
|
||||
Adding back-end storage capacity to an Object Storage cluster requires
|
||||
careful planning and forethought. In the design phase, it is important
|
||||
to determine the maximum partition power required by the Object Storage
|
||||
service, which determines the maximum number of partitions which can
|
||||
exist. Object Storage distributes data among all available storage, but
|
||||
a partition cannot span more than one disk, so the maximum number of
|
||||
partitions can only be as high as the number of disks.
|
||||
|
||||
For example, a system that starts with a single disk and a partition
|
||||
power of 3 can have 8 (2^3) partitions. Adding a second disk means that
|
||||
each has 4 partitions. The one-disk-per-partition limit means that this
|
||||
system can never have more than 8 disks, limiting its scalability.
|
||||
However, a system that starts with a single disk and a partition power
|
||||
of 10 can have up to 1024 (2^10) disks.
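
A small sketch of the partition-power arithmetic described above:

.. code-block:: python

   # Partition count, and therefore the maximum useful disk count,
   # for a few example partition powers.
   for part_power in (3, 10, 14):
       partitions = 2 ** part_power
       print(f"partition power {part_power}: {partitions} partitions, "
             f"at most {partitions} disks")
   # A power of 3 caps the cluster at 8 disks; a power of 10 allows up
   # to 1024 disks, at the cost of tracking more partitions per disk.
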
|
||||
|
||||
As you add back-end storage capacity to the system, the partition maps
|
||||
redistribute data amongst the storage nodes. In some cases, this
|
||||
involves replication of extremely large data sets. In these cases, we
|
||||
recommend using back-end replication links that do not contend with
|
||||
tenants' access to data.
|
||||
|
||||
As more tenants begin to access data within the cluster and their data
|
||||
sets grow, it is necessary to add front-end bandwidth to service data
|
||||
access requests. Adding front-end bandwidth to an Object Storage cluster
|
||||
requires careful planning and design of the Object Storage proxies that
|
||||
tenants use to gain access to the data, along with the high availability
|
||||
solutions that enable easy scaling of the proxy layer. We recommend
|
||||
designing a front-end load balancing layer that tenants and consumers
|
||||
use to gain access to data stored within the cluster. This load
|
||||
balancing layer may be distributed across zones, regions or even across
|
||||
geographic boundaries, which may also require that the design encompass
|
||||
geo-location solutions.
|
||||
|
||||
In some cases, you must add bandwidth and capacity to the network
|
||||
resources servicing requests between proxy servers and storage nodes.
|
||||
For this reason, the network architecture used for access to storage
|
||||
nodes and proxy servers should make use of a design which is scalable.
|
||||
|
||||
|
||||
Redundancy
|
||||
----------
|
||||
|
||||
When making swift more redundant, one approach is to add additional proxy
servers and load balancing. HAProxy is one method of providing load
balancing and high availability and is often combined with keepalived
or pacemaker to ensure the HAProxy service maintains a stable VIP.
Sample HAProxy configurations can be found in the
`OpenStack HA Guide <https://docs.openstack.org/ha-guide/>`_.
|
||||
|
||||
Replication
|
||||
-----------
|
||||
|
||||
Replicas in Object Storage function independently, and clients only
|
||||
require a majority of nodes to respond to a request in order for an
|
||||
operation to be considered successful. Thus, transient failures like
|
||||
network partitions can quickly cause replicas to diverge.
|
||||
These differences are eventually reconciled by
|
||||
asynchronous, peer-to-peer replicator processes. The replicator processes
|
||||
traverse their local filesystems, concurrently performing operations in a
|
||||
manner that balances load across physical disks.
|
||||
|
||||
Replication uses a push model, with records and files generally only being
|
||||
copied from local to remote replicas. This is important because data on the
|
||||
node may not belong there (as in the case of handoffs and ring changes), and a
|
||||
replicator can not know what data exists elsewhere in the cluster that it
|
||||
should pull in. It is the duty of any node that contains data to ensure that
|
||||
data gets to where it belongs. Replica placement is handled by the ring.
|
||||
|
||||
Every deleted record or file in the system is marked by a tombstone, so that
|
||||
deletions can be replicated alongside creations. The replication process cleans
|
||||
up tombstones after a time period known as the consistency window. The
|
||||
consistency window encompasses replication duration and the length of time a
|
||||
transient failure can remove a node from the cluster. Tombstone cleanup must be
|
||||
tied to replication to reach replica convergence.
|
||||
|
||||
If a replicator detects that a remote drive has failed, the replicator uses the
|
||||
``get_more_nodes`` interface for the ring to choose an alternative node with
|
||||
which to synchronize. The replicator can maintain desired levels of replication
|
||||
in the face of disk failures, though some replicas may not be in an immediately
|
||||
usable location.
|
||||
|
||||
.. note::
|
||||
|
||||
The replicator does not maintain desired levels of replication when other
|
||||
failures occur, such as entire node failures, because most failures are
|
||||
transient.
|
||||
|
||||
Replication is an area of active development, and implementation details
are likely to change over time.
|
||||
|
||||
There are two major classes of replicator: the db replicator, which replicates
|
||||
accounts and containers, and the object replicator, which replicates object
|
||||
data.
|
||||
|
||||
For more information, please see the `Swift replication page <https://docs.openstack.org/swift/latest/overview_replication.html>`_.
|
@ -1,329 +0,0 @@
|
||||
================
|
||||
Storage concepts
|
||||
================
|
||||
|
||||
Storage is found in many parts of the OpenStack cloud environment. It is
|
||||
important to understand the distinction between
|
||||
:term:`ephemeral <ephemeral volume>` storage and
|
||||
:term:`persistent <persistent volume>` storage:
|
||||
|
||||
- Ephemeral storage - If you only deploy OpenStack
|
||||
:term:`Compute service (nova)`, by default your users do not have access to
|
||||
any form of persistent storage. The disks associated with VMs are ephemeral,
|
||||
meaning that from the user's point of view they disappear when a virtual
|
||||
machine is terminated.
|
||||
|
||||
- Persistent storage - Persistent storage means that the storage resource
|
||||
outlives any other resource and is always available, regardless of the state
|
||||
of a running instance.
|
||||
|
||||
OpenStack clouds explicitly support three types of persistent
|
||||
storage: *Object Storage*, *Block Storage*, and *File-based storage*.
|
||||
|
||||
Object storage
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
Object storage is implemented in OpenStack by the
|
||||
Object Storage service (swift). Users access binary objects through a REST API.
|
||||
If your intended users need to archive or manage large datasets, you should
|
||||
provide them with Object Storage service. Additional benefits include:
|
||||
|
||||
- OpenStack can store your virtual machine (VM) images inside of an Object
|
||||
Storage system, as an alternative to storing the images on a file system.
|
||||
- Integration with OpenStack Identity, and works with the OpenStack Dashboard.
|
||||
- Better support for distributed deployments across multiple datacenters
|
||||
through support for asynchronous eventual consistency replication.
|
||||
|
||||
You should consider using the OpenStack Object Storage service if you eventually
|
||||
plan on distributing your storage cluster across multiple data centers, if you
|
||||
need unified accounts for your users for both compute and object storage, or if
|
||||
you want to control your object storage with the OpenStack Dashboard. For more
|
||||
information, see the `Swift project page <https://www.openstack.org/software/releases/ocata/components/swift>`_.
|
||||
|
||||
Block storage
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
Block storage is implemented in OpenStack by the
|
||||
Block Storage service (cinder). Because these volumes are
|
||||
persistent, they can be detached from one instance and re-attached to another
|
||||
instance and the data remains intact.
|
||||
|
||||
The Block Storage service supports multiple back ends in the form of drivers.
|
||||
Your choice of a storage back end must be supported by a block storage
|
||||
driver.
|
||||
|
||||
Most block storage drivers allow the instance to have direct access to
|
||||
the underlying storage hardware's block device. This helps increase the
|
||||
overall read/write IO. However, support for utilizing files as volumes
|
||||
is also well established, with full support for NFS, GlusterFS and
|
||||
others.
|
||||
|
||||
These drivers work a little differently than a traditional block
|
||||
storage driver. On an NFS or GlusterFS file system, a single file is
|
||||
created and then mapped as a virtual volume into the instance. This
|
||||
mapping and translation is similar to how OpenStack utilizes QEMU's
|
||||
file-based virtual machines stored in ``/var/lib/nova/instances``.
|
||||
|
||||
File-based storage
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In a multi-tenant OpenStack cloud environment, the Shared File Systems service
|
||||
(manila) provides a set of services for management of shared file systems. The
|
||||
Shared File Systems service supports multiple back-ends in the form of drivers,
|
||||
and can be configured to provision shares from one or more back-ends. Share
|
||||
servers are virtual machines that export file shares using different file
|
||||
system protocols such as NFS, CIFS, GlusterFS, or HDFS.
|
||||
|
||||
The Shared File Systems service is persistent storage and can be mounted to any
|
||||
number of client machines. It can also be detached from one instance and
|
||||
attached to another instance without data loss. During this process the data
|
||||
are safe unless the Shared File Systems service itself is changed or removed.
|
||||
|
||||
Users interact with the Shared File Systems service by mounting remote file
systems on their instances and then using those file systems for file storage
and exchange. The Shared File Systems service provides shares, where a share
is a remote, mountable file system. You can mount a share and access it from
several hosts and by several users at a time. With shares, you can also:
|
||||
|
||||
* Create a share specifying its size, shared file system protocol,
|
||||
visibility level.
|
||||
* Create a share on either a share server or standalone, depending on
|
||||
the selected back-end mode, with or without using a share network.
|
||||
* Specify access rules and security services for existing shares.
|
||||
* Combine several shares in groups to keep data consistency inside the
  groups so that subsequent group operations can be performed safely.
|
||||
* Create a snapshot of a selected share or a share group for storing
|
||||
the existing shares consistently or creating new shares from that
|
||||
snapshot in a consistent way.
|
||||
* Create a share from a snapshot.
|
||||
* Set rate limits and quotas for specific shares and snapshots.
|
||||
* View usage of share resources.
|
||||
* Remove shares.
|
||||
|
||||
Differences between storage types
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
:ref:`table_openstack_storage` explains the differences between OpenStack
storage types.
|
||||
|
||||
.. _table_openstack_storage:
|
||||
|
||||
.. list-table:: Table. OpenStack storage
|
||||
:widths: 20 20 20 20 20
|
||||
:header-rows: 1
|
||||
|
||||
* -
|
||||
- Ephemeral storage
|
||||
- Block storage
|
||||
- Object storage
|
||||
- Shared File System storage
|
||||
* - Application
|
||||
- Run operating system and scratch space
|
||||
- Add additional persistent storage to a virtual machine (VM)
|
||||
- Store data, including VM images
|
||||
- Add additional persistent storage to a virtual machine
|
||||
* - Accessed through…
|
||||
- A file system
|
||||
- A block device that can be partitioned, formatted, and mounted
|
||||
(such as, /dev/vdc)
|
||||
- The REST API
|
||||
- A Shared File Systems service share (either manila managed or an
|
||||
external one registered in manila) that can be partitioned, formatted
|
||||
and mounted (such as /dev/vdc)
|
||||
* - Accessible from…
|
||||
- Within a VM
|
||||
- Within a VM
|
||||
- Anywhere
|
||||
- Within a VM
|
||||
* - Managed by…
|
||||
- OpenStack Compute (nova)
|
||||
- OpenStack Block Storage (cinder)
|
||||
- OpenStack Object Storage (swift)
|
||||
- OpenStack Shared File System Storage (manila)
|
||||
* - Persists until…
|
||||
- VM is terminated
|
||||
- Deleted by user
|
||||
- Deleted by user
|
||||
- Deleted by user
|
||||
* - Sizing determined by…
|
||||
- Administrator configuration of size settings, known as *flavors*
|
||||
- User specification in initial request
|
||||
- Amount of available physical storage
|
||||
- * User specification in initial request
|
||||
* Requests for extension
|
||||
       * Available user-level quotas
|
||||
* Limitations applied by Administrator
|
||||
* - Encryption configuration
|
||||
- Parameter in ``nova.conf``
|
||||
- Admin establishing `encrypted volume type
|
||||
<https://docs.openstack.org/admin-guide/dashboard-manage-volumes.html>`_,
|
||||
then user selecting encrypted volume
|
||||
- Not yet available
|
||||
- Shared File Systems service does not apply any additional encryption
|
||||
above what the share’s back-end storage provides
|
||||
* - Example of typical usage…
|
||||
- 10 GB first disk, 30 GB second disk
|
||||
- 1 TB disk
|
||||
- 10s of TBs of dataset storage
|
||||
- Depends completely on the size of back-end storage specified when
|
||||
a share was being created. In case of thin provisioning it can be
|
||||
partial space reservation (for more details see
|
||||
`Capabilities and Extra-Specs
|
||||
<https://docs.openstack.org/manila/latest/contributor/capabilities_and_extra_specs.html#common-capabilities>`_
|
||||
specification)
|
||||
|
||||
.. note::
|
||||
|
||||
**File-level storage for live migration**
|
||||
|
||||
With file-level storage, users access stored data using the operating
|
||||
system's file system interface. Most users who have used a network
|
||||
storage solution before have encountered this form of networked
|
||||
storage. The most common file system protocol for Unix is NFS, and for
|
||||
Windows, CIFS (previously, SMB).
|
||||
|
||||
OpenStack clouds do not present file-level storage to end users.
|
||||
However, it is important to consider file-level storage for storing
|
||||
instances under ``/var/lib/nova/instances`` when designing your cloud,
|
||||
since you must have a shared file system if you want to support live
|
||||
migration.
|
||||
|
||||
Commodity storage technologies
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
There are various commodity storage back end technologies available. Depending
|
||||
on your cloud user's needs, you can implement one or many of these technologies
|
||||
in different combinations.
|
||||
|
||||
Ceph
|
||||
----
|
||||
|
||||
Ceph is a scalable storage solution that replicates data across commodity
|
||||
storage nodes.
|
||||
|
||||
Ceph utilizes an object storage mechanism for data storage and exposes
the data via different types of storage interfaces to the end user. It
supports interfaces for:
|
||||
- Object storage
|
||||
- Block storage
|
||||
- File-system interfaces
|
||||
|
||||
Ceph provides support for the same Object Storage API as swift and can
|
||||
be used as a back end for the Block Storage service (cinder) as well as
|
||||
back-end storage for glance images.
|
||||
|
||||
Ceph supports thin provisioning implemented using copy-on-write. This can
|
||||
be useful when booting from volume because a new volume can be provisioned
|
||||
very quickly. Ceph also supports keystone-based authentication (as of
|
||||
version 0.56), so it can be a seamless swap in for the default OpenStack
|
||||
swift implementation.
|
||||
|
||||
Ceph's advantages include:
|
||||
|
||||
- The administrator has more fine-grained control over data distribution and
|
||||
replication strategies.
|
||||
- Consolidation of object storage and block storage.
|
||||
- Fast provisioning of boot-from-volume instances using thin provisioning.
|
||||
- Support for the distributed file-system interface
|
||||
`CephFS <http://ceph.com/docs/master/cephfs/>`_.
|
||||
|
||||
You should consider Ceph if you want to manage your object and block storage
|
||||
within a single system, or if you want to support fast boot-from-volume.
|
||||
|
||||
Gluster
|
||||
-------
|
||||
|
||||
A distributed shared file system. As of Gluster version 3.3, you
|
||||
can use Gluster to consolidate your object storage and file storage
|
||||
into one unified file and object storage solution, which is called
|
||||
Gluster For OpenStack (GFO). GFO uses a customized version of swift
|
||||
that enables Gluster to be used as the back-end storage.
|
||||
|
||||
The main reason to use GFO rather than swift is if you also
|
||||
want to support a distributed file system, either to support shared
|
||||
storage live migration or to provide it as a separate service to
|
||||
your end users. If you want to manage your object and file storage
|
||||
within a single system, you should consider GFO.
|
||||
|
||||
LVM
|
||||
---
|
||||
|
||||
The Logical Volume Manager (LVM) is a Linux-based system that provides an
|
||||
abstraction layer on top of physical disks to expose logical volumes
|
||||
to the operating system. The LVM back-end implements block storage
|
||||
as LVM logical partitions.
|
||||
|
||||
On each host that will house block storage, an administrator must
|
||||
initially create a volume group dedicated to Block Storage volumes.
|
||||
Blocks are created from LVM logical volumes.
|
||||
|
||||
.. note::
|
||||
|
||||
LVM does *not* provide any replication. Typically,
|
||||
administrators configure RAID on nodes that use LVM as block
|
||||
storage to protect against failures of individual hard drives.
|
||||
However, RAID does not protect against a failure of the entire
|
||||
host.
|
||||
|
||||
iSCSI
|
||||
-----
|
||||
|
||||
Internet Small Computer Systems Interface (iSCSI) is a network protocol that
|
||||
operates on top of the Transport Control Protocol (TCP) for linking data
|
||||
storage devices. It transports data between an iSCSI initiator on a server
|
||||
and iSCSI target on a storage device.
|
||||
|
||||
iSCSI is suitable for cloud environments with Block Storage service to support
|
||||
applications or for file sharing systems. Network connectivity can be
|
||||
achieved at a lower cost compared to other storage back end technologies since
|
||||
iSCSI does not require host bus adaptors (HBA) or storage-specific network
|
||||
devices.
|
||||
|
||||
.. Add tips? iSCSI traffic on a separate network or virtual vLAN?
|
||||
|
||||
NFS
|
||||
---
|
||||
|
||||
Network File System (NFS) is a file system protocol that allows a user or
|
||||
administrator to mount a file system on a server. File clients can access
|
||||
mounted file systems through Remote Procedure Calls (RPC).
|
||||
|
||||
The benefits of NFS are a low implementation cost, due to shared NICs and
traditional network components, and a simpler configuration and setup process.
|
||||
|
||||
For more information on configuring Block Storage to use NFS storage, see
|
||||
`Configure an NFS storage back end
|
||||
<https://docs.openstack.org/admin-guide/blockstorage-nfs-backend.html>`_ in the
|
||||
OpenStack Administrator Guide.
|
||||
|
||||
Sheepdog
|
||||
--------
|
||||
|
||||
Sheepdog is a userspace distributed storage system. Sheepdog scales
|
||||
to several hundred nodes, and has powerful virtual disk management
|
||||
features like snapshot, cloning, rollback and thin provisioning.
|
||||
|
||||
It is essentially an object storage system that manages disks and
|
||||
aggregates the space and performance of disks linearly in hyper
|
||||
scale on commodity hardware in a smart way. On top of its object store,
|
||||
Sheepdog provides elastic volume service and http service.
|
||||
Sheepdog does not require a specific kernel version and works nicely
with xattr-supported file systems.
|
||||
|
||||
ZFS
|
||||
---
|
||||
|
||||
The Solaris iSCSI driver for OpenStack Block Storage implements
|
||||
blocks as ZFS entities. ZFS is a file system that also has the
|
||||
functionality of a volume manager. This is unlike on a Linux system,
|
||||
where there is a separation of volume manager (LVM) and file system
|
||||
(such as, ext3, ext4, xfs, and btrfs). ZFS has a number of
|
||||
advantages over ext4, including improved data-integrity checking.
|
||||
|
||||
The ZFS back end for OpenStack Block Storage supports only
|
||||
Solaris-based systems, such as Illumos. While there is a Linux port
|
||||
of ZFS, it is not included in any of the standard Linux
|
||||
distributions, and it has not been tested with OpenStack Block
|
||||
Storage. As with LVM, ZFS does not provide replication across hosts
|
||||
on its own; you need to add a replication solution on top of ZFS if
|
||||
your cloud needs to be able to handle storage-node failures.
|
@ -1,50 +0,0 @@
|
||||
.. _design:
|
||||
|
||||
======
|
||||
Design
|
||||
======
|
||||
|
||||
Designing an OpenStack cloud requires an understanding of the cloud user's
|
||||
requirements and needs to determine the best possible configuration. This
|
||||
chapter provides guidance on the decisions you need to make during the
|
||||
design process.
|
||||
|
||||
To design, deploy, and configure OpenStack, administrators must
|
||||
understand the logical architecture. OpenStack modules are one of the
|
||||
following types:
|
||||
|
||||
Daemon
|
||||
Runs as a background process. On Linux platforms, a daemon is usually
|
||||
installed as a service.
|
||||
|
||||
Script
|
||||
Installs a virtual environment and runs tests.
|
||||
|
||||
Command-line interface (CLI)
|
||||
Enables users to submit API calls to OpenStack services through commands.
|
||||
|
||||
:ref:`logical_architecture` shows one example of the most common
|
||||
integrated services within OpenStack and how they interact with each
|
||||
other. End users can interact through the dashboard, CLIs, and APIs.
|
||||
All services authenticate through a common Identity service, and
|
||||
individual services interact with each other through public APIs, except
|
||||
where privileged administrator commands are necessary.
|
||||
|
||||
.. _logical_architecture:
|
||||
|
||||
.. figure:: common/figures/osog_0001.png
|
||||
:width: 100%
|
||||
:alt: OpenStack Logical Architecture
|
||||
|
||||
OpenStack Logical Architecture
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
design-compute.rst
|
||||
design-storage.rst
|
||||
design-networking.rst
|
||||
design-identity.rst
|
||||
design-images.rst
|
||||
design-control-plane.rst
|
||||
design-cmp-tools.rst
|
@ -1,52 +0,0 @@
|
||||
.. meta::
|
||||
:description: This guide targets OpenStack Architects
|
||||
for architectural design
|
||||
:keywords: Architecture, OpenStack
|
||||
|
||||
===================================
|
||||
OpenStack Architecture Design Guide
|
||||
===================================
|
||||
|
||||
.. important::
|
||||
|
||||
**This guide is no longer maintained by the OpenStack documentation
|
||||
team. If you wish to update this guide, propose a patch at your
|
||||
own leisure.**
|
||||
|
||||
This guide was last updated as of the Pike release, documenting
|
||||
the OpenStack Ocata, Newton, and Mitaka releases. It may
|
||||
not apply to EOL releases Kilo and Liberty.
|
||||
|
||||
   We advise that you read this at your own discretion when planning
   your OpenStack cloud. This guide is intended as advice only.
|
||||
|
||||
Abstract
|
||||
~~~~~~~~
|
||||
|
||||
The Architecture Design Guide provides information on planning and designing
|
||||
an OpenStack cloud. It explains core concepts, cloud architecture design
|
||||
requirements, and the design criteria of key components and services in an
|
||||
OpenStack cloud. The guide also describes five common cloud use cases.
|
||||
|
||||
Before reading this book, we recommend:
|
||||
|
||||
* Prior knowledge of cloud architecture and principles.
|
||||
* Linux and virtualization experience.
|
||||
* A basic understanding of networking principles and protocols.
|
||||
|
||||
For information about deploying and operating OpenStack, see the
|
||||
`Installation Guides <https://docs.openstack.org/ocata/install/>`_,
|
||||
`Deployment Guides <https://docs.openstack.org/ocata/deploy/>`_,
|
||||
and the `OpenStack Operations Guide <https://docs.openstack.org/operations-guide/>`_.
|
||||
|
||||
Contents
|
||||
~~~~~~~~
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
common/conventions.rst
|
||||
arch-requirements.rst
|
||||
design.rst
|
||||
use-cases.rst
|
||||
common/appendix.rst
|
@ -1,14 +0,0 @@
|
||||
.. _use-cases:
|
||||
|
||||
=========
|
||||
Use cases
|
||||
=========
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
use-cases/use-case-development
|
||||
use-cases/use-case-general-compute
|
||||
use-cases/use-case-web-scale
|
||||
use-cases/use-case-storage
|
||||
use-cases/use-case-nfv
|
@ -1,14 +0,0 @@
|
||||
.. _development-cloud:
|
||||
|
||||
=================
|
||||
Development cloud
|
||||
=================
|
||||
|
||||
Design model
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Requirements
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Component block diagram
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
@ -1,196 +0,0 @@
|
||||
.. _general-compute-cloud:
|
||||
|
||||
=====================
|
||||
General compute cloud
|
||||
=====================
|
||||
|
||||
Design model
|
||||
~~~~~~~~~~~~
|
||||
|
||||
An online classified advertising company wants to run web applications
|
||||
consisting of Tomcat, Nginx, and MariaDB in a private cloud. To meet the
|
||||
policy requirements, the cloud infrastructure will run in their
|
||||
own data center. The company has predictable load requirements but
|
||||
requires scaling to cope with nightly increases in demand. Their current
|
||||
environment does not have the flexibility to align with their goal of
|
||||
running an open source API environment. The current environment consists
|
||||
of the following:
|
||||
|
||||
* Between 120 and 140 installations of Nginx and Tomcat, each with 2
|
||||
vCPUs and 4 GB of RAM
|
||||
|
||||
* A three node MariaDB and Galera cluster, each with 4 vCPUs and 8 GB
|
||||
of RAM
|
||||
|
||||
The company runs hardware load balancers and multiple web applications
|
||||
serving their websites and orchestrates environments using combinations
|
||||
of scripts and Puppet. The website generates large amounts of log data
|
||||
daily that requires archiving.
|
||||
|
||||
The solution would consist of the following OpenStack components:
|
||||
|
||||
* A firewall, switches and load balancers on the public facing network
|
||||
connections.
|
||||
|
||||
* OpenStack Controller service running Image service, Identity service,
|
||||
Networking service, combined with support services such as MariaDB and
|
||||
RabbitMQ, configured for high availability on at least three controller
|
||||
nodes.
|
||||
|
||||
* OpenStack compute nodes running the KVM hypervisor.
|
||||
|
||||
* OpenStack Block Storage for use by compute instances, requiring
|
||||
persistent storage (such as databases for dynamic sites).
|
||||
|
||||
* OpenStack Object Storage for serving static objects (such as images).
|
||||
|
||||
.. figure:: ../figures/General_Architecture3.png
|
||||
|
||||
Running up to 140 web instances and the small number of MariaDB
|
||||
instances requires 292 vCPUs available, as well as 584 GB of RAM. On a
|
||||
typical 1U server using dual-socket hex-core Intel CPUs with
|
||||
Hyperthreading, and assuming 2:1 CPU overcommit ratio, this would
|
||||
require 8 OpenStack Compute nodes.
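
The arithmetic behind these figures can be sketched as follows; the
thread count and overcommit ratio match the assumptions stated above,
and the final node count also depends on per-node memory and on leaving
headroom for the failure of a compute node:

.. code-block:: python

   import math

   # Workload from the example above.
   web_instances, web_vcpu, web_ram_gb = 140, 2, 4
   db_instances, db_vcpu, db_ram_gb = 3, 4, 8

   total_vcpu = web_instances * web_vcpu + db_instances * db_vcpu        # 292
   total_ram_gb = web_instances * web_ram_gb + db_instances * db_ram_gb  # 584

   # Dual-socket hex-core with hyperthreading, 2:1 CPU overcommit.
   threads_per_node = 2 * 6 * 2
   vcpu_per_node = threads_per_node * 2

   nodes_for_cpu = math.ceil(total_vcpu / vcpu_per_node)
   print(total_vcpu, total_ram_gb, nodes_for_cpu)   # 292, 584, 7
   # Seven nodes satisfy the CPU arithmetic alone; eight allow for RAM
   # and for continuing to run the workload if one node fails.
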
|
||||
|
||||
The web application instances run from local storage on each of the
|
||||
OpenStack Compute nodes. The web application instances are stateless,
|
||||
meaning that any of the instances can fail and the application will
|
||||
continue to function.
|
||||
|
||||
MariaDB server instances store their data on shared enterprise storage,
|
||||
such as NetApp or Solidfire devices. If a MariaDB instance fails,
|
||||
storage would be expected to be re-attached to another instance and
|
||||
rejoined to the Galera cluster.
|
||||
|
||||
Logs from the web application servers are shipped to OpenStack Object
|
||||
Storage for processing and archiving.
|
||||
|
||||
Additional capabilities can be realized by moving static web content to
|
||||
be served from OpenStack Object Storage containers, and backing the
|
||||
OpenStack Image service with OpenStack Object Storage.
|
||||
|
||||
.. note::
|
||||
|
||||
Increasing OpenStack Object Storage means network bandwidth needs to
|
||||
be taken into consideration. Running OpenStack Object Storage with
|
||||
network connections offering 10 GbE or better connectivity is
|
||||
advised.
|
||||
|
||||
Leveraging Orchestration and Telemetry services is also a potential
|
||||
issue when providing auto-scaling, orchestrated web application
|
||||
environments. Defining the web applications in a
|
||||
:term:`Heat Orchestration Template (HOT)`
|
||||
negates the reliance on the current scripted Puppet
|
||||
solution.
|
||||
|
||||
OpenStack Networking can be used to control hardware load balancers
|
||||
through the use of plug-ins and the Networking API. This allows users to
|
||||
control hardware load balance pools and instances as members in these
|
||||
pools, but their use in production environments must be carefully
|
||||
weighed against current stability.
|
||||
|
||||
Requirements
|
||||
~~~~~~~~~~~~
|
||||
|
||||
.. temporarily location of storage information until we establish a template
|
||||
|
||||
Storage requirements
|
||||
--------------------
|
||||
Using a scale-out storage solution with direct-attached storage (DAS) in
|
||||
the servers is well suited for a general purpose OpenStack cloud. Cloud
|
||||
services requirements determine your choice of scale-out solution. You
|
||||
need to determine if a single, highly expandable and highly vertically
scalable, centralized storage array is suitable for your design. After
determining an approach, select the storage hardware based on these
criteria.
|
||||
|
||||
This list expands upon the potential impacts for including a particular
|
||||
storage architecture (and corresponding storage hardware) into the
|
||||
design for a general purpose OpenStack cloud:
|
||||
|
||||
Connectivity
|
||||
If storage protocols other than Ethernet are part of the storage solution,
|
||||
ensure the appropriate hardware has been selected. If a centralized storage
|
||||
array is selected, ensure that the hypervisor will be able to connect to
|
||||
that storage array for image storage.
|
||||
|
||||
Usage
|
||||
How the particular storage architecture will be used is critical for
|
||||
determining the architecture. Some of the configurations that will
|
||||
influence the architecture include whether it will be used by the
|
||||
hypervisors for ephemeral instance storage, or if OpenStack Object
|
||||
Storage will use it for object storage.
|
||||
|
||||
Instance and image locations
|
||||
Where instances and images will be stored will influence the
|
||||
architecture.
|
||||
|
||||
Server hardware
|
||||
If the solution is a scale-out storage architecture that includes
|
||||
DAS, it will affect the server hardware selection. This could ripple
|
||||
into the decisions that affect host density, instance density, power
|
||||
density, OS-hypervisor, management tools and others.
|
||||
|
||||
A general purpose OpenStack cloud has multiple options. The key factors
|
||||
that will have an influence on selection of storage hardware for a
|
||||
general purpose OpenStack cloud are as follows:
|
||||
|
||||
Capacity
|
||||
Hardware resources selected for the resource nodes should be capable
|
||||
of supporting enough storage for the cloud services. Defining the
|
||||
initial requirements and ensuring the design can support adding
|
||||
capacity is important. Hardware nodes selected for object storage
|
||||
   should be capable of supporting a large number of inexpensive disks
|
||||
with no reliance on RAID controller cards. Hardware nodes selected
|
||||
for block storage should be capable of supporting high speed storage
|
||||
solutions and RAID controller cards to provide performance and
|
||||
redundancy to storage at a hardware level. Selecting hardware RAID
|
||||
controllers that automatically repair damaged arrays will assist
|
||||
with the replacement and repair of degraded or deleted storage
|
||||
devices.
|
||||
|
||||
Performance
|
||||
Disks selected for object storage services do not need to be fast
|
||||
performing disks. We recommend that object storage nodes take
|
||||
advantage of the best cost per terabyte available for storage.
|
||||
Contrastingly, disks chosen for block storage services should take
|
||||
advantage of performance boosting features that may entail the use
|
||||
of SSDs or flash storage to provide high performance block storage
|
||||
pools. Storage performance of ephemeral disks used for instances
|
||||
should also be taken into consideration.
|
||||
|
||||
Fault tolerance
|
||||
Object storage resource nodes have no requirements for hardware
|
||||
fault tolerance or RAID controllers. It is not necessary to plan for
|
||||
fault tolerance within the object storage hardware because the
|
||||
object storage service provides replication between zones as a
|
||||
feature of the service. Block storage nodes, compute nodes, and
|
||||
cloud controllers should all have fault tolerance built in at the
|
||||
hardware level by making use of hardware RAID controllers and
|
||||
varying levels of RAID configuration. The level of RAID chosen
|
||||
should be consistent with the performance and availability
|
||||
requirements of the cloud.
|
||||
|
||||
|
||||
Network hardware requirements
|
||||
-----------------------------
|
||||
|
||||
For a compute-focus architecture, we recommend designing the network
|
||||
architecture using a scalable network model that makes it easy to add
|
||||
capacity and bandwidth. A good example of such a model is the leaf-spine
|
||||
model. In this type of network design, you can add additional
|
||||
bandwidth as well as scale out to additional racks of gear. It is important to
|
||||
select network hardware that supports port count, port speed, and
|
||||
port density while allowing for future growth as workload demands
|
||||
increase. In the network architecture, it is also important to evaluate
|
||||
where to provide redundancy.
|
||||
|
||||
Network software requirements
|
||||
-----------------------------
|
||||
For a general purpose OpenStack cloud, the OpenStack infrastructure
|
||||
components need to be highly available. If the design does not include
|
||||
hardware load balancing, networking software packages like HAProxy will
|
||||
need to be included.
|
||||
|
||||
Component block diagram
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
@ -1,181 +0,0 @@
|
||||
.. _nfv-cloud:
|
||||
|
||||
==============================
|
||||
Network virtual function cloud
|
||||
==============================
|
||||
|
||||
|
||||
Design model
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Requirements
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Component block diagram
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
|
||||
Network-focused cloud examples
|
||||
------------------------------
|
||||
|
||||
An organization designs a large scale cloud-based web application. The
|
||||
application scales horizontally in a bursting behavior and generates a
|
||||
high instance count. The application requires an SSL connection to secure
|
||||
data and must not lose connection state to individual servers.
|
||||
|
||||
The figure below depicts an example design for this workload. In this
|
||||
example, a hardware load balancer provides SSL offload functionality and
|
||||
connects to tenant networks in order to reduce address consumption. This
|
||||
load balancer links to the routing architecture as it services the VIP
|
||||
for the application. The router and load balancer use the GRE tunnel ID
|
||||
of the application's tenant network and an IP address within the tenant
|
||||
subnet but outside of the address pool. This is to ensure that the load
|
||||
balancer can communicate with the application's HTTP servers without
|
||||
requiring the consumption of a public IP address.
|
||||
|
||||
Because sessions persist until closed, the routing and switching
|
||||
architecture provides high availability. Switches mesh to each
|
||||
hypervisor and each other, and also provide an MLAG implementation to
|
||||
ensure that layer-2 connectivity does not fail. Routers use VRRP and
|
||||
fully mesh with switches to ensure layer-3 connectivity. Since GRE
|
||||
provides an overlay network, Networking is present and uses the Open
|
||||
vSwitch agent in GRE tunnel mode. This ensures all devices can reach all
|
||||
other devices and that you can create tenant networks for private
|
||||
addressing links to the load balancer.
|
||||
|
||||
.. figure:: ../figures/Network_Web_Services1.png
|
||||
|
||||
A web service architecture has many options and optional components. Due
|
||||
to this, it can fit into a large number of other OpenStack designs. A
|
||||
few key components, however, need to be in place to handle the nature of
|
||||
most web-scale workloads. You require the following components:
|
||||
|
||||
* OpenStack Controller services (Image service, Identity service, Networking
|
||||
service, and supporting services such as MariaDB and RabbitMQ)
|
||||
|
||||
* OpenStack Compute running KVM hypervisor
|
||||
|
||||
* OpenStack Object Storage
|
||||
|
||||
* Orchestration service
|
||||
|
||||
* Telemetry service
|
||||
|
||||
Beyond the normal Identity service, Compute service, Image service, and
|
||||
Object Storage components, we recommend the Orchestration service component
|
||||
to handle the proper scaling of workloads to adjust to demand. Due to the
|
||||
requirement for auto-scaling, the design includes the Telemetry service.
|
||||
Web services tend to be bursty in load, have very defined peak and
|
||||
valley usage patterns and, as a result, benefit from automatic scaling
|
||||
of instances based upon traffic. At a network level, a split network
|
||||
configuration works well with databases residing on private tenant
|
||||
networks since these do not emit a large quantity of broadcast traffic
|
||||
and may need to interconnect to some databases for content.
|
||||
|
||||
Load balancing
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
Load balancing spreads requests across multiple instances. This workload
|
||||
scales well horizontally across large numbers of instances. This enables
|
||||
instances to run without publicly routed IP addresses and instead to
|
||||
rely on the load balancer to provide a globally reachable service. Many
|
||||
of these services do not require direct server return. This aids in
|
||||
address planning and utilization at scale since only the virtual IP
|
||||
(VIP) must be public.
|
||||
|
||||
Overlay networks
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
The overlay functionality design includes OpenStack Networking in Open
|
||||
vSwitch GRE tunnel mode. In this case, the layer-3 external routers pair
|
||||
with VRRP, and switches pair with an implementation of MLAG to ensure
|
||||
that you do not lose connectivity with the upstream routing
|
||||
infrastructure.
|
||||
|
||||
Performance tuning
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Network level tuning for this workload is minimal. :term:`Quality of
|
||||
Service (QoS)` applies to these workloads for a middle ground Class
|
||||
Selector depending on existing policies. It is higher than a best effort
|
||||
queue but lower than an Expedited Forwarding or Assured Forwarding
|
||||
queue. Since this type of application generates larger packets with
|
||||
longer-lived connections, you can optimize bandwidth utilization for
|
||||
long duration TCP. Normal bandwidth planning applies here with regards
|
||||
to benchmarking a session's usage multiplied by the expected number of
|
||||
concurrent sessions with overhead.
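
A minimal sketch of that planning arithmetic; the per-session rate,
concurrency, and overhead allowance are assumptions for illustration:

.. code-block:: python

   # Bandwidth plan: per-session usage times expected concurrency,
   # plus an allowance for protocol overhead.
   session_mbps = 2.0            # benchmarked per-session usage
   concurrent_sessions = 4000
   overhead = 0.10               # framing, TLS, retransmits

   plan_gbps = session_mbps * concurrent_sessions * (1 + overhead) / 1000
   print(f"plan for ~{plan_gbps:.1f} Gbps of front-end bandwidth")
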
|
||||
|
||||
Network functions
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
Network functions is a broad category but encompasses workloads that
|
||||
support the rest of a system's network. These workloads tend to consist
|
||||
of large amounts of small packets that are very short lived, such as DNS
|
||||
queries or SNMP traps. These messages need to arrive quickly and do not
cope well with packet loss, as there can be a very large volume of them. There
|
||||
are a few extra considerations to take into account for this type of
|
||||
workload and this can change a configuration all the way to the
|
||||
hypervisor level. For an application that generates 10 TCP sessions per
|
||||
user with an average bandwidth of 512 kilobytes per second per flow and
|
||||
expected user count of ten thousand concurrent users, the expected
|
||||
bandwidth plan is approximately 4.88 gigabits per second.
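
As a hedged illustration, the quoted figure is reproduced below by
treating the 512 value as an aggregate of kilobits per second per user
(an interpretation assumed here for the arithmetic, since 512 kilobytes
per second on each of the 100,000 flows would demand far more capacity):

.. code-block:: python

   users = 10_000
   per_user_kbps = 512                        # assumed aggregate per user

   total_bps = users * per_user_kbps * 1024   # kilobits -> bits
   print(f"{total_bps / 1024 ** 3:.2f} Gbit/s")   # ~4.88 (binary prefixes)
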
|
||||
|
||||
The supporting network for this type of configuration needs to have a
|
||||
low latency and evenly distributed availability. This workload benefits
|
||||
from having services local to the consumers of the service. Use a
|
||||
multi-site approach as well as deploying many copies of the application
|
||||
to handle load as close as possible to consumers. Since these
|
||||
applications function independently, they do not warrant running
|
||||
overlays to interconnect tenant networks. Overlays also have the
|
||||
drawback of performing poorly with rapid flow setup and may incur too
|
||||
much overhead with large quantities of small packets and therefore we do
|
||||
not recommend them.
|
||||
|
||||
QoS is desirable for some workloads to ensure delivery. DNS has a major
|
||||
impact on the load times of other services and needs to be reliable and
|
||||
provide rapid responses. Configure rules in upstream devices to apply a
|
||||
higher Class Selector to DNS to ensure faster delivery or a better spot
|
||||
in queuing algorithms.
|
||||
|
||||
Cloud storage
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
Another common use case for OpenStack environments is providing a
|
||||
cloud-based file storage and sharing service. You might consider this a
|
||||
storage-focused use case, but its network-side requirements make it a
|
||||
network-focused use case.
|
||||
|
||||
For example, consider a cloud backup application. This workload has two
|
||||
specific behaviors that impact the network. Because this workload is an
|
||||
externally-facing service and an internally-replicating application, it
|
||||
has both :term:`north-south<north-south traffic>` and
|
||||
:term:`east-west<east-west traffic>` traffic considerations:
|
||||
|
||||
north-south traffic
|
||||
When a user uploads and stores content, that content moves into the
|
||||
OpenStack installation. When users download this content, the
|
||||
content moves out from the OpenStack installation. Because this
|
||||
service operates primarily as a backup, most of the traffic moves
|
||||
southbound into the environment. In this situation, it benefits you
|
||||
to configure a network to be asymmetrically downstream because the
|
||||
traffic that enters the OpenStack installation is greater than the
|
||||
traffic that leaves the installation.
|
||||
|
||||
east-west traffic
|
||||
Likely to be fully symmetric. Because replication originates from
|
||||
any node and might target multiple other nodes algorithmically, it
|
||||
is less likely for this traffic to have a larger volume in any
|
||||
specific direction. However, this traffic might interfere with
|
||||
north-south traffic.
|
||||
|
||||
.. figure:: ../figures/Network_Cloud_Storage2.png
|
||||
|
||||
This application prioritizes the north-south traffic over east-west
|
||||
traffic: the north-south traffic involves customer-facing data.
|
||||
|
||||
The network design, in this case, is less dependent on availability and
|
||||
more dependent on being able to handle high bandwidth. As a direct
|
||||
result, it is beneficial to forgo redundant links in favor of bonding
|
||||
those connections. This increases available bandwidth. It is also
|
||||
beneficial to configure all devices in the path, including OpenStack, to
|
||||
generate and pass jumbo frames.
|
@ -1,210 +0,0 @@
|
||||
.. _storage-cloud:
|
||||
|
||||
=============
|
||||
Storage cloud
|
||||
=============
|
||||
|
||||
Design model
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Storage-focused architecture depends on specific use cases. This section
|
||||
discusses three example use cases:
|
||||
|
||||
* An object store with a RESTful interface
|
||||
|
||||
* Compute analytics with parallel file systems
|
||||
|
||||
* High performance database
|
||||
|
||||
|
||||
An object store with a RESTful interface
|
||||
----------------------------------------
|
||||
|
||||
The example below shows a REST interface without a high performance
|
||||
requirement. The following diagram depicts the example architecture:
|
||||
|
||||
.. figure:: ../figures/Storage_Object.png
|
||||
|
||||
The example REST interface, presented as a traditional Object Store
|
||||
running on traditional spindles, does not require a high performance
|
||||
caching tier.
|
||||
|
||||
This example uses the following components:
|
||||
|
||||
Network:
|
||||
|
||||
* 10 GbE horizontally scalable spine leaf back-end storage and front
|
||||
end network.
|
||||
|
||||
Storage hardware:
|
||||
|
||||
* 10 storage servers each with 12x4 TB disks equaling 480 TB total
|
||||
space with approximately 160 TB of usable space after replicas.
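
The arithmetic behind these capacity figures, ignoring file system and
ring overhead, is straightforward:

.. code-block:: python

   servers = 10
   disks_per_server = 12
   disk_tb = 4
   replicas = 3

   raw_tb = servers * disks_per_server * disk_tb
   usable_tb = raw_tb / replicas
   print(raw_tb, usable_tb)   # 480 TB raw, 160 TB usable before overhead
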
|
||||
|
||||
Proxy:
|
||||
|
||||
* 3x proxies
|
||||
|
||||
* 2x10 GbE bonded front end
|
||||
|
||||
* 2x10 GbE back-end bonds
|
||||
|
||||
* Approximately 60 Gb of total bandwidth to the back-end storage
|
||||
cluster
|
||||
|
||||
.. note::
|
||||
|
||||
It may be necessary to implement a third party caching layer for some
|
||||
applications to achieve suitable performance.
|
||||
|
||||
|
||||
|
||||
Compute analytics with data processing service
|
||||
----------------------------------------------
|
||||
|
||||
Analytics of large data sets are dependent on the performance of the
|
||||
storage system. Clouds using storage systems such as Hadoop Distributed
|
||||
File System (HDFS) have inefficiencies which can cause performance
|
||||
issues.
|
||||
|
||||
One potential solution to this problem is the implementation of storage
|
||||
systems designed for performance. Parallel file systems have previously
|
||||
filled this need in the HPC space and are suitable for large scale
|
||||
performance-orientated systems.
|
||||
|
||||
OpenStack has integration with Hadoop to manage the Hadoop cluster
|
||||
within the cloud. The following diagram shows an OpenStack store with a
|
||||
high performance requirement:
|
||||
|
||||
.. figure:: ../figures/Storage_Hadoop3.png
|
||||
|
||||
The hardware requirements and configuration are similar to those of the
|
||||
High Performance Database example below. In this case, the architecture
|
||||
uses Ceph's Swift-compatible REST interface, with features that allow a
caching pool to be connected to accelerate the presented pool.

High performance database with Database service
------------------------------------------------

Databases are a common workload that benefits from high performance
storage back ends. Although enterprise storage is not a requirement,
many environments have existing storage that an OpenStack cloud can use
as a back end. You can create a storage pool to provide block devices
with OpenStack Block Storage for instances as well as object interfaces.
In this example, the database I/O requirements are high and demand
storage presented from a fast SSD pool.

A storage system presents a LUN backed by a set of SSDs using a
traditional storage array with OpenStack Block Storage integration or a
storage platform such as Ceph or Gluster.
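
As a sketch of how this is wired together, assuming a Block Storage
back end for the SSD pool is already defined in ``cinder.conf``, a
dedicated volume type can steer database volumes onto that pool; the
back-end name, size, and server name below are placeholders.

.. code-block:: console

   $ openstack volume type create ssd
   $ openstack volume type set --property volume_backend_name=ssd-pool ssd
   $ openstack volume create --type ssd --size 100 db-data
   $ openstack server add volume database-server db-data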

This system can provide additional performance. For example, in the
database example below, a portion of the SSD pool can act as a block
device to the Database server. In the high performance analytics
example, the inline SSD cache layer accelerates the REST interface.

.. figure:: ../figures/Storage_Database_+_Object5.png

In this example, Ceph presents a Swift-compatible REST interface, as
well as block level storage from a distributed storage cluster. It is
highly flexible and has features that enable reduced cost of operations
such as self healing and auto balancing. Using erasure coded pools is a
suitable way of maximizing the amount of usable space.

.. note::

   There are special considerations around erasure coded pools. For
   example, they have higher computational requirements and impose
   limitations on the operations allowed on an object: erasure coded
   pools do not support partial writes.
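
On a Ceph cluster, an erasure coded pool is normally created from an
erasure-code profile, along the lines of the sketch below; the profile
name, k/m values, and placement-group counts are illustrative only.

.. code-block:: console

   # ceph osd erasure-code-profile set ec-4-2 k=4 m=2
   # ceph osd pool create object-ec-pool 128 128 erasure ec-4-2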

Using Ceph as an applicable example, a potential architecture would have
the following requirements:

Network:

* 10 GbE horizontally scalable spine leaf back-end storage and
  front-end network

Storage hardware:

* 5 storage servers for the caching layer (24x1 TB SSD)

* 10 storage servers each with 12x4 TB disks, which equals 480 TB total
  space with approximately 160 TB of usable space after 3 replicas

REST proxy:

* 3x proxies

* 2x10 GbE bonded front end

* 2x10 GbE back-end bonds

* Approximately 60 Gb of total bandwidth to the back-end storage
  cluster

Using an SSD cache layer, you can present block devices directly to
hypervisors or instances. The REST interface can also use the SSD cache
systems as an inline cache.
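
One possible way to realize such an inline cache with Ceph is a cache
tier layered over the backing object pool, sketched below; the pool
names are placeholders, both pools must already exist, and the cache
sizing and flush thresholds need tuning for the workload.

.. code-block:: console

   # ceph osd tier add object-pool ssd-cache-pool
   # ceph osd tier cache-mode ssd-cache-pool writeback
   # ceph osd tier set-overlay object-pool ssd-cache-pool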


Requirements
~~~~~~~~~~~~

Storage requirements
--------------------

Storage-focused OpenStack clouds must address I/O intensive workloads.
These workloads are not CPU intensive, nor are they consistently network
intensive. The network may be heavily utilized to transfer storage data,
but the workloads are not otherwise network intensive.

The selection of storage hardware determines the overall performance and
scalability of a storage-focused OpenStack design architecture. Several
factors impact the design process, including:

Latency
  A key consideration in a storage-focused OpenStack cloud is latency.
  Using solid-state disks (SSDs) to minimize latency and reduce CPU
  delays caused by waiting for the storage increases performance. Use
  RAID controller cards in compute hosts to improve the performance of
  the underlying disk subsystem.

Scale-out solutions
  Depending on the storage architecture, you can adopt a scale-out
  solution, or use a highly expandable and scalable centralized storage
  array. If a centralized storage array meets your requirements, then
  the array vendor determines the hardware selection. It is possible to
  build a storage array using commodity hardware with Open Source
  software, but doing so requires people with the expertise to build
  such a system.

  On the other hand, a scale-out storage solution that uses
  direct-attached storage (DAS) in the servers may be an appropriate
  choice. This requires configuration of the server hardware to support
  the storage solution.

Considerations affecting the storage architecture (and corresponding
storage hardware) of a storage-focused OpenStack cloud include:

Connectivity
  Ensure the connectivity matches the storage solution requirements. We
  recommend confirming that the network characteristics minimize latency
  to boost the overall performance of the design.

Latency
  Determine if the use case has consistent or highly variable latency.

Throughput
  Ensure that the storage solution throughput is optimized for your
  application requirements.

Server hardware
  Use of DAS impacts the server hardware choice and affects host
  density, instance density, power density, OS-hypervisor, and
  management tools.

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
@ -1,14 +0,0 @@
.. _web-scale-cloud:

===============
Web scale cloud
===============

Design model
~~~~~~~~~~~~

Requirements
~~~~~~~~~~~~

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
@ -27,11 +27,10 @@ while [[ $# > 0 ]] ; do
 done
 
 # PDF targets for Install guides are dealt in build-install-guides-rst.sh
-PDF_TARGETS=( 'arch-design'\
-              'image-guide' \
+PDF_TARGETS=( 'image-guide' \
               'install-guide')
 
-for guide in arch-design doc-contrib-guide glossary \
+for guide in doc-contrib-guide glossary \
     image-guide install-guide; do
     if [[ ${PDF_TARGETS[*]} =~ $guide ]]; then
         tools/build-rst.sh doc/$guide --build build \