Initial import of arch-design from openstack-manuals

This imports the docs for the Architecture Design Guide out of the
openstack-manuals repo to be an independent repo owned by the Ops Docs
SIG.

Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
Sean McGinnis 2018-11-29 14:23:52 -06:00
commit 499187b981
GPG Key ID: CE7EE4BFAF8D70C8
78 changed files with 20161 additions and 0 deletions

22
.gitignore vendored Normal file

@ -0,0 +1,22 @@
.DS_Store
*.xpr
# Packages
.venv
*.egg
*.egg-info
# Testenvironment
.tox
# Build directories
doc/build
# Transifex Client Setting
.tx
# Editors
*~
.*.swp
.bak
*.pyc

4
.gitreview Normal file

@ -0,0 +1,4 @@
[gerrit]
host=review.openstack.org
port=29418
project=openstack/arch-guide.git

6
.zuul.yaml Normal file

@ -0,0 +1,6 @@
- project:
    templates:
      - build-openstack-docs-pti
    post:
      jobs:
        - publish-openstack-tox-docs-direct

176
LICENSE Normal file

@ -0,0 +1,176 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

80
README.rst Normal file

@ -0,0 +1,80 @@
===================================
OpenStack Architecture Design Guide
===================================
This repository contains the source files for the OpenStack Architecture Guide.
You can read this guide at `docs.openstack.org/arch-design
<http://docs.openstack.org/arch-design>`_.
Prerequisites
-------------
At a minimum, you will need git and the git-review tool installed in order to
contribute documentation. You will also need a `Gerrit account
<https://docs.openstack.org/infra/manual/developers.html#account-setup>`_ to
submit the change.
Git is available for Linux, Mac, and Windows environments. Some platforms come
with it preinstalled, but you can review the `installation instructions
<https://git-scm.com/book/en/v2/Getting-Started-Installing-Git>`_ if you
do not have it by default.
Once git is installed, you can follow the instructions for your platform to
`install git-review <https://www.mediawiki.org/wiki/Gerrit/git-review>`_.
The last step is to configure git with the name and email address used for
your Gerrit account so that it can link your patch to your user. Run the
following to set these values:

.. code-block:: console

   git config --global user.name "First Last"
   git config --global user.email "your_email@youremail.com"

Submitting Updates
------------------
Proposing updates to the documentation is fairly straightforward once you've
done it, but a few steps can appear intimidating your first couple of times
through. Here is a suggested workflow to help you along the way.

.. code-block:: console

   git clone https://git.openstack.org/openstack/arch-design
   cd arch-design

   # it is useful to make changes on a separate branch in case you need to make
   # other changes
   git checkout -b my-topic

   # edit your files
   git add .
   git commit # Add a descriptive commit message

   # submit your changes for review
   git review

The changes will then be run through a few tests to make sure the docs build,
after which they will be ready for review. Once reviewed, if no problems are
found, the changes will be merged to the repo and published to the
docs.openstack.org site.
Local Testing
-------------
If you would like to build the docs locally to make sure there are no issues
with the changes, and to view locally generated HTML files, you will need to do
a couple of extra steps.
The jobs are run using a tool called `tox`. You will need to install tox on
your platform first following its `installation guide
<https://tox.readthedocs.io/en/latest/install.html>`_.
You can then run the following to perform a local build with some tests:
.. code-block:: console

   tox -e docs

If you have any questions, please reach out on the #openstack-operators IRC
channel or through the openstack-discuss mailing list.

6
doc/requirements.txt Normal file

@ -0,0 +1,6 @@
# The order of packages is significant, because pip processes them in the order
# of appearance. Changing the order has an impact on the overall integration
# process, which may cause wedges in the gate later.
openstackdocstheme>=1.27.1 # Apache-2.0
doc8>=0.6.0 # Apache-2.0
sphinx!=1.6.6,!=1.6.7,>=1.6.2 # BSD


@ -0,0 +1,13 @@
=========================
Architecture requirements
=========================
This chapter describes the enterprise and operational factors that impact the
design of an OpenStack cloud.

.. toctree::
   :maxdepth: 2

   arch-requirements/arch-requirements-enterprise
   arch-requirements/arch-requirements-operations
   arch-requirements/arch-requirements-ha


@ -0,0 +1,433 @@
=======================
Enterprise requirements
=======================
The following sections describe the business, usage, and performance
considerations that impact cloud architecture design.
Cost
~~~~
Financial factors are a primary concern for any organization. Cost
considerations may influence the type of cloud that you build.
For example, a general purpose cloud is unlikely to be the most
cost-effective environment for specialized applications.
Unless business needs dictate that cost is a critical factor,
cost should not be the sole consideration when choosing or designing a cloud.
As a general guideline, increasing the complexity of a cloud architecture
increases the cost of building and maintaining it. For example, a hybrid or
multi-site cloud architecture involving multiple vendors and technical
architectures may require higher setup and operational costs because of the
need for more sophisticated orchestration and brokerage tools than in other
architectures. However, overall operational costs might be lower by virtue of
using a cloud brokerage tool to deploy the workloads to the most cost effective
platform.
.. TODO Replace examples with the proposed example use cases in this guide.
Consider the following cost categories when designing a cloud:
* Compute resources
* Networking resources
* Replication
* Storage
* Management
* Operational costs
It is also important to consider how costs will increase as your cloud scales.
Choices that have a negligible impact in small systems may considerably
increase costs in large systems. In these cases, it is important to minimize
capital expenditure (CapEx) at all layers of the stack. Operators of massively
scalable OpenStack clouds need dependable commodity hardware and freely
available open source software components to reduce deployment costs and
operational expenses. Initiatives such as the `Open Compute Project
<http://www.opencompute.org>`_ provide additional information.
Time-to-market
~~~~~~~~~~~~~~
The ability to deliver services or products within a flexible time
frame is a common business factor when building a cloud. Allowing users to
self-provision and gain access to compute, network, and
storage resources on-demand may decrease time-to-market for new products
and applications.
You must balance the time required to build a new cloud platform against the
time saved by migrating users away from legacy platforms. In some cases,
existing infrastructure may influence your architecture choices. For example,
using multiple cloud platforms may be a good option when there is an existing
investment in several applications, as it could be faster to tie the
investments together rather than migrating the components and refactoring them
to a single platform.
Revenue opportunity
~~~~~~~~~~~~~~~~~~~
Revenue opportunities vary based on the intent and use case of the cloud.
The requirements of a commercial, customer-facing product are often very
different from an internal, private cloud. You must consider what features
make your design most attractive to your users.
Capacity planning and scalability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Capacity and the placement of workloads are key design considerations
for clouds. A long-term capacity plan for these designs must
incorporate growth over time to prevent permanent consumption of more
expensive external clouds. To avoid this scenario, account for future
applications' capacity requirements and plan growth appropriately.
It is difficult to predict the amount of load a particular
application might incur if the number of users fluctuates, or the
application experiences an unexpected increase in use.
It is possible to define application requirements in terms of
vCPU, RAM, bandwidth, or other resources and plan appropriately.
However, other clouds might not use the same meter or even the same
oversubscription rates.
Oversubscription is a method to emulate more capacity than
may physically be present. For example, a physical hypervisor node with 32 GB
RAM may host 24 instances, each provisioned with 2 GB RAM.
As long as all 24 instances do not concurrently use 2 full
gigabytes, this arrangement works well.
However, some hosts take oversubscription to extremes and,
as a result, performance can be inconsistent.
If at all possible, determine what the oversubscription rates
of each host are and plan capacity accordingly.
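The arithmetic behind the example above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the calculation ignores the memory used by the hypervisor itself, which a real capacity plan would need to include.

```python
def oversubscription_ratio(instances: int, ram_per_instance_gb: float,
                           host_ram_gb: float) -> float:
    """Return provisioned RAM divided by physical RAM for one host."""
    if host_ram_gb <= 0:
        raise ValueError("host_ram_gb must be positive")
    return (instances * ram_per_instance_gb) / host_ram_gb


def max_average_usage_gb(instances: int, host_ram_gb: float) -> float:
    """Return the average RAM each instance can use before the host is full.

    Assumption: hypervisor overhead is ignored.
    """
    if instances <= 0:
        raise ValueError("instances must be positive")
    return host_ram_gb / instances


if __name__ == "__main__":
    # Values taken from the example in the text: 24 instances with 2 GB each
    # on a node with 32 GB of physical RAM.
    ratio = oversubscription_ratio(24, 2, 32)
    average = max_average_usage_gb(24, 32)
    print(f"oversubscription ratio: {ratio:.2f}:1")
    print(f"average usable RAM per instance: {average:.2f} GB")
```

With the numbers from the text, 48 GB is provisioned against 32 GB of physical RAM, a ratio of 1.5:1, so the arrangement holds only while the instances average about 1.33 GB or less.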
.. TODO Considerations when building your cloud, racks, CPUs, compute node
density. For ongoing capacity planning refer to the Ops Guide.
Performance
~~~~~~~~~~~
Performance is a critical consideration when designing any cloud, and becomes
increasingly important as size and complexity grow. While single-site, private
clouds can be closely controlled, multi-site and hybrid deployments require
more careful planning to reduce problems such as network latency between sites.
For example, you should consider the time required to
run a workload in different clouds and methods for reducing this time.
This may require moving data closer to applications or applications
closer to the data they process, and grouping functionality so that
connections that require low latency take place over a single cloud
rather than spanning clouds.
This may also require a CMP that can determine which cloud can most
efficiently run which types of workloads.
Using native OpenStack tools can help improve performance.
For example, you can use Telemetry to measure performance and the
Orchestration service (heat) to react to changes in demand.
.. note::
Orchestration requires special client configurations to integrate
with Amazon Web Services. For other types of clouds, use CMP features.
Cloud resource deployment
The cloud user expects repeatable, dependable, and deterministic processes
for launching and deploying cloud resources. You could deliver this through
a web-based interface or publicly available API endpoints. All appropriate
options for requesting cloud resources must be available through some type
of user interface, a command-line interface (CLI), or API endpoints.
Consumption model
Cloud users expect a fully self-service and on-demand consumption model.
When an OpenStack cloud reaches the massively scalable size, expect
consumption as a service in each and every way.
* Everything must be capable of automation. For example, everything from
compute hardware, storage hardware, networking hardware, to the installation
and configuration of the supporting software. Manual processes are
impractical in a massively scalable OpenStack design architecture.
* Massively scalable OpenStack clouds require extensive metering and
monitoring functionality to maximize the operational efficiency by keeping
the operator informed about the status and state of the infrastructure. This
includes full scale metering of the hardware and software status. A
corresponding framework of logging and alerting is also required to store
and enable operations to act on the meters provided by the metering and
monitoring solutions. The cloud operator also needs a solution that uses the
data provided by the metering and monitoring solution to provide capacity
planning and capacity trending analysis.
Location
For many use cases the proximity of the user to their workloads has a
direct influence on the performance of the application and therefore
should be taken into consideration in the design. Certain applications
require zero to minimal latency that can only be achieved by deploying
the cloud in multiple locations. These locations could be in different
data centers, cities, countries or geographical regions, depending on
the user requirement and location of the users.
Input-Output requirements
Input-Output performance requirements require researching and
modeling before deciding on a final storage framework. Running
benchmarks for Input-Output performance provides a baseline for
expected performance levels. If these tests include details, then
the resulting data can help model behavior and results during
different workloads. Running smaller scripted benchmarks during the
lifecycle of the architecture helps record the system health at
different points in time. The data from these scripted benchmarks
assists in future scoping and in gaining a deeper understanding of an
organization's needs.
Scale
Scaling storage solutions in a storage-focused OpenStack
architecture design is driven by initial requirements, including
:term:`IOPS <Input/output Operations Per Second (IOPS)>`, capacity,
bandwidth, and future needs. Planning capacity based on projected needs
over the course of a budget cycle is important for a design. The
architecture should balance cost and capacity, while also allowing
flexibility to implement new technologies and methods as they become
available.
Network
~~~~~~~
It is important to consider the functionality, security, scalability,
availability, and testability of the network when choosing a CMP and cloud
provider.
* Decide on a network framework and design minimum functionality tests.
This ensures that testing and functionality persist during and after
upgrades.
* Scalability across multiple cloud providers may dictate which underlying
network framework you choose in different cloud providers.
It is important to present the network API functions and to verify
that functionality persists across all cloud endpoints chosen.
* High availability implementations vary in functionality and design.
Examples of some common methods are active-hot-standby, active-passive,
and active-active.
Development of high availability and test frameworks is necessary to
ensure understanding of functionality and limitations.
* Consider the security of data between the client and the endpoint,
and of traffic that traverses the multiple clouds.
For example, degraded video streams and low quality VoIP sessions negatively
impact user experience and may lead to productivity and economic loss.
Network misconfigurations
Configuring incorrect IP addresses, VLANs, and routers can cause
outages to areas of the network or, in the worst-case scenario, the
entire cloud infrastructure. Automate network configurations to
minimize the opportunity for operator error, which can cause
disruptive problems.
Capacity planning
Cloud networks require management for capacity and growth over time.
Capacity planning includes the purchase of network circuits and
hardware that can potentially have lead times measured in months or
years.
Network tuning
Configure cloud networks to minimize link loss, packet loss, packet
storms, broadcast storms, and loops.
Single Point Of Failure (SPOF)
Consider high availability at the physical and environmental layers.
If there is a single point of failure due to only one upstream link,
or only one power supply, an outage can become unavoidable.
Complexity
An overly complex network design can be difficult to maintain and
troubleshoot. While device-level configuration can ease maintenance
concerns and automated tools can handle overlay networks, avoid or
document non-traditional interconnects between functions and
specialized hardware to prevent outages.
Non-standard features
There are additional risks that arise from configuring the cloud
network to take advantage of vendor-specific features. One example
is multi-chassis link aggregation (MLAG) used to provide redundancy at the
aggregator switch level of the network. MLAG is not a standard and,
as a result, each vendor has its own proprietary implementation of
the feature. MLAG architectures are not interoperable across switch
vendors, which leads to vendor lock-in and can delay or prevent
component upgrades.
Dynamic resource expansion or bursting
An application that requires additional resources may suit a multiple
cloud architecture. For example, a retailer needs additional resources
during the holiday season, but does not want to add private cloud
resources to meet the peak demand.
The user can accommodate the increased load by bursting to
a public cloud for these peak load periods. These bursts could be
for long or short cycles ranging from hourly to yearly.
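The advice under "Network misconfigurations" above to automate network configuration can start with a simple pre-deployment check. The following is a minimal sketch using only the Python standard library. The plan layout (a mapping of network name to a ``cidr`` string and a ``vlan`` number) and the function name are assumptions made for illustration and are not part of any OpenStack API.

```python
import ipaddress
from itertools import combinations


def validate_network_plan(plan: dict) -> list:
    """Return a list of human-readable problems found in a network plan.

    `plan` maps a network name to {"cidr": str, "vlan": int}. This layout is
    a hypothetical example format, not an OpenStack data structure.
    """
    errors = []
    parsed = {}
    for name, spec in plan.items():
        vlan = spec["vlan"]
        # IEEE 802.1Q VLAN IDs 0 and 4095 are reserved; usable IDs are 1-4094.
        if not 1 <= vlan <= 4094:
            errors.append(f"{name}: VLAN ID {vlan} is outside 1-4094")
        try:
            # strict=True rejects CIDRs with host bits set, e.g. 10.0.0.1/24.
            parsed[name] = ipaddress.ip_network(spec["cidr"], strict=True)
        except ValueError as exc:
            errors.append(f"{name}: invalid CIDR {spec['cidr']!r} ({exc})")
    # Compare every pair of valid networks of the same IP version.
    for (a, net_a), (b, net_b) in combinations(parsed.items(), 2):
        if net_a.version == net_b.version and net_a.overlaps(net_b):
            errors.append(f"{a} ({net_a}) overlaps {b} ({net_b})")
    return errors


if __name__ == "__main__":
    example_plan = {
        "management": {"cidr": "10.0.0.0/24", "vlan": 10},
        "storage": {"cidr": "10.0.0.128/25", "vlan": 5000},
    }
    for problem in validate_network_plan(example_plan):
        print(problem)
```

Running a check like this before a change is applied catches some classes of operator error, such as overlapping subnets or out-of-range VLAN IDs, before they reach the network.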
Compliance and geo-location
~~~~~~~~~~~~~~~~~~~~~~~~~~~
An organization may have certain legal obligations and regulatory
compliance measures which could require certain workloads or data to not
be located in certain regions.
Compliance considerations are particularly important for multi-site clouds.
Considerations include:
- federal legal requirements
- local jurisdictional legal and compliance requirements
- image consistency and availability
- storage replication and availability (both block and file/object storage)
- authentication, authorization, and auditing (AAA)
Geographical considerations may also impact the cost of building or leasing
data centers. Considerations include:
- floor space
- floor weight
- rack height and type
- environmental considerations
- power usage and power usage efficiency (PUE)
- physical security
Auditing
~~~~~~~~
A well-considered auditing plan is essential for quickly finding issues.
Tracking changes made to security groups and tenants can be useful for
rolling back those changes if they affect production. For example,
if all security group rules for a tenant disappeared, the ability to quickly
track down the issue would be important for operational and legal reasons.
For more details on auditing, see the `Compliance chapter
<https://docs.openstack.org/security-guide/compliance.html>`_ in the OpenStack
Security Guide.
Security
~~~~~~~~
The importance of security varies based on the type of organization using
a cloud. For example, government and financial institutions often have
very high security requirements. Security should be implemented according to
asset, threat, and vulnerability risk assessment matrices.
See `security-requirements`.
Service level agreements
~~~~~~~~~~~~~~~~~~~~~~~~
Service level agreements (SLAs) must be developed in conjunction with business,
technical, and legal input. Small, private clouds may operate under an informal
SLA, but hybrid or public clouds generally require more formal agreements with
their users.
For a user of a massively scalable OpenStack public cloud, there are no
expectations for control over security, performance, or availability. Users
expect only SLAs related to uptime of API services, and very basic SLAs for
services offered. It is the user's responsibility to address these issues on
their own. The exception to this expectation is the rare case of a massively
scalable cloud infrastructure built for a private or government organization
that has specific requirements.
High performance systems have SLA requirements for a minimum quality of service
with regard to guaranteed uptime, latency, and bandwidth. The level of the
SLA can have a significant impact on the network architecture and
requirements for redundancy in the systems.
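To see why the SLA level affects redundancy requirements, it helps to convert an uptime percentage into an allowed downtime budget. The following is a minimal sketch. The function name and the 30-day period are illustrative assumptions, and real SLAs define their own measurement windows and exclusions.

```python
def downtime_budget_minutes(availability_percent: float,
                            period_days: float = 30.0) -> float:
    """Return the downtime, in minutes, permitted by an availability target.

    Assumption: the SLA is measured over a fixed window of `period_days`.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        budget = downtime_budget_minutes(target)
        print(f"{target}% uptime allows about {budget:.1f} minutes "
              f"of downtime per 30 days")
```

Each additional "nine" cuts the budget by a factor of ten, which is why stricter SLAs usually require redundant links, power, and services rather than faster manual recovery.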
Hybrid cloud designs must accommodate differences in SLAs between providers,
and consider their enforceability.
Application readiness
~~~~~~~~~~~~~~~~~~~~~
Some applications are tolerant of a lack of synchronized object
storage, while others may need those objects to be replicated and
available across regions. Understanding how the cloud implementation
impacts new and existing applications is important for risk mitigation,
and the overall success of a cloud project. Applications may have to be
written or rewritten for an infrastructure with little to no redundancy,
or with the cloud in mind.
Application momentum
Businesses with existing applications may find that it is
more cost effective to integrate applications on multiple
cloud platforms than migrating them to a single platform.
No predefined usage model
The lack of a pre-defined usage model enables the user to run a wide
variety of applications without having to know the application
requirements in advance. This provides a degree of independence and
flexibility that no other cloud scenarios are able to provide.
On-demand and self-service application
By definition, a cloud provides end users with the ability to
self-provision computing power, storage, networks, and software in a
simple and flexible way. The user must be able to scale their
resources up to a substantial level without disrupting the
underlying host operations. One of the benefits of using a general
purpose cloud architecture is the ability to start with limited
resources and increase them over time as the user demand grows.
Authentication
~~~~~~~~~~~~~~
It is recommended to have a single authentication domain rather than a
separate implementation for each and every site. This requires an
authentication mechanism that is highly available and distributed to
ensure continuous operation. Authentication server locality might be
required and should be planned for.
Migration, availability, site loss and recovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Outages can cause partial or full loss of site functionality. Strategies
should be implemented to understand and plan for recovery scenarios.
* The deployed applications need to continue to function and, more
importantly, you must consider the impact on the performance and
reliability of the application when a site is unavailable.
* It is important to understand what happens to the replication of
objects and data between the sites when a site goes down. If this
causes queues to start building up, consider how long these queues
can safely exist until an error occurs.
* After an outage, ensure that a method is in place for resuming proper
operation of the site when it comes back online. We recommend you
architect the recovery to avoid race conditions.
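The question raised above of how long replication queues can safely exist during a site outage can be estimated with simple arithmetic. The following is a rough sketch. The function name and parameters are hypothetical, and the model assumes constant enqueue and drain rates, which real workloads rarely have.

```python
def seconds_until_queue_full(queue_capacity: int, current_depth: int,
                             enqueue_rate: float,
                             drain_rate: float = 0.0) -> float:
    """Estimate how long a replication queue can grow before it is full.

    Rates are in items per second. While the remote site is down the drain
    rate is typically zero. Returns infinity if the queue is not growing.
    """
    net_growth = enqueue_rate - drain_rate
    if net_growth <= 0:
        return float("inf")
    remaining = max(queue_capacity - current_depth, 0)
    return remaining / net_growth


if __name__ == "__main__":
    # Hypothetical numbers: room for 1,000,000 items, 100,000 already queued,
    # and 250 new items per second while the remote site is unreachable.
    seconds = seconds_until_queue_full(1_000_000, 100_000, 250.0)
    print(f"queue fills in about {seconds / 60:.0f} minutes")
```

Comparing this estimate with the expected time to restore a site indicates whether the queue capacity is sufficient or whether writes need to be throttled or rejected during an outage.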
Disaster recovery and business continuity
Cheaper storage makes the public cloud suitable for maintaining
backup applications.
Migration scenarios
Hybrid cloud architecture enables the migration of
applications between different clouds.
Provider availability or implementation details
Business changes can affect provider availability.
Likewise, changes in a provider's service can disrupt
a hybrid cloud environment or increase costs.
Provider API changes
Consumers of external clouds rarely have control over provider
changes to APIs, and changes can break compatibility.
Using only the most common and basic APIs can minimize potential conflicts.
Image portability
As of the Kilo release, there is no common image format that is
usable by all clouds. Conversion or recreation of images is necessary
if migrating between clouds. To simplify deployment, use the smallest
and simplest images feasible, install only what is necessary, and
use a deployment manager such as Chef or Puppet. Do not use golden
images to speed up the process unless you repeatedly deploy the same
images on the same cloud.
API differences
Avoid hybrid cloud deployments that combine OpenStack with other
platforms (or with different versions of OpenStack), as API differences
can cause compatibility issues.
Business or technical diversity
Organizations leveraging cloud-based services can embrace business
diversity and utilize a hybrid cloud design to spread their
workloads across multiple cloud providers. This ensures that
no single cloud provider is the sole host for an application.


@ -0,0 +1,182 @@
.. _high-availability:
=================
High availability
=================
Data plane and control plane
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When designing an OpenStack cloud, it is important to consider the needs
dictated by the :term:`Service Level Agreement (SLA)`. This includes the core
services required to maintain availability of running Compute service
instances, networks, storage, and additional services running on top of those
resources. These services are often referred to as the Data Plane services,
and are generally expected to be available all the time.
The remaining services, responsible for create, read, update and delete (CRUD)
operations, metering, monitoring, and so on, are often referred to as the
Control Plane. The SLA is likely to dictate a lower uptime requirement for
these services.
The services comprising an OpenStack cloud have a number of requirements that
you need to understand in order to be able to meet SLA terms. For example, to
provide the Compute service, a minimum of storage, message queueing, and
database services is necessary, as well as the networking between
them.
Ongoing maintenance operations are made much simpler if there is logical and
physical separation of Data Plane and Control Plane systems. It then becomes
possible to, for example, reboot a controller without affecting customers.
If one service failure affects the operation of an entire server (``noisy
neighbor``), the separation between Control and Data Planes enables rapid
maintenance with a limited effect on customer operations.
Eliminating single points of failure within each site
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OpenStack lends itself to deployment in a highly available manner where it is
expected that at least two servers be utilized. These can run all the services
involved, from the message queuing service (for example, ``RabbitMQ`` or
``QPID``) to an appropriately deployed database service such as ``MySQL`` or
``MariaDB``. As services in the cloud are scaled out, back-end services will
need to scale too. Monitoring and reporting on server utilization and response
times, as well as load testing your systems, will help determine scale out
decisions.
The OpenStack services themselves should be deployed across multiple servers
that do not represent a single point of failure. Ensuring availability can
be achieved by placing these services behind highly available load balancers
that have multiple OpenStack servers as members.
There are a small number of OpenStack services that are intended to run in
only one place at a time (for example, the ``ceilometer-agent-central``
service). To prevent these services from becoming a single point of failure,
they can be controlled by clustering software such as ``Pacemaker``.
In OpenStack, the infrastructure is integral to providing services and should
always be available, especially when operating with SLAs. Ensuring network
availability is accomplished by designing the network architecture so that no
single point of failure exists. A consideration of the number of switches,
routes and redundancies of power should be factored into core infrastructure,
as well as the associated bonding of networks to provide diverse routes to your
highly available switch infrastructure.
Care must be taken when deciding network functionality. Currently, OpenStack
supports both the legacy networking (nova-network) system and the newer,
extensible OpenStack Networking (neutron). OpenStack Networking and legacy
networking both have their advantages and disadvantages. They are both valid
and supported options that fit different network deployment models described in
the `OpenStack Operations Guide
<https://docs.openstack.org/ops-guide/arch_network_design.html#network-topology>`_.
When using the Networking service, the OpenStack controller servers or separate
Networking hosts handle routing unless the dynamic virtual routers pattern for
routing is selected. Running routing directly on the controller servers mixes
the Data and Control Planes and can cause complex issues with performance and
troubleshooting. It is possible to use third party software and external
appliances that help maintain highly available layer three routes. Doing so
allows for common application endpoints to control network hardware, or to
provide complex multi-tier web applications in a secure manner. It is also
possible to completely remove routing from Networking, and instead rely on
hardware routing capabilities. In this case, the switching infrastructure must
support layer three routing.
Application design must also be factored into the capabilities of the
underlying cloud infrastructure. If the compute hosts do not provide a seamless
live migration capability, then it must be expected that if a compute host
fails, the instances on that host and any data local to them will be lost.
However, when providing an expectation to users that instances have a
high level of uptime guaranteed, the infrastructure must be deployed in a way
that eliminates any single point of failure if a compute host disappears.
This may include utilizing shared file systems on enterprise storage or
OpenStack Block Storage to provide a level of guarantee to match service
features.
If using a storage design that includes shared access to centralized storage,
ensure that this is also designed without single points of failure and the SLA
for the solution matches or exceeds the expected SLA for the Data Plane.
Eliminating single points of failure in a multi-region design
-------------------------------------------------------------
Some services are commonly shared between multiple regions, including the
Identity service and the Dashboard. In this case, it is necessary to ensure
that the databases backing the services are replicated, and that access to
multiple workers across each site can be maintained in the event of losing a
single region.
Multiple network links should be deployed between sites to provide redundancy
for all components. This includes storage replication, which should be isolated
to a dedicated network or VLAN with the ability to assign QoS to control the
replication traffic or provide priority for this traffic.
.. note::
If the data store is highly changeable, the network requirements could have
a significant effect on the operational cost of maintaining the sites.
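As a rough illustration of how the rate of change drives network requirements,
the following minimal sketch estimates the average bandwidth needed to
replicate a day's worth of changed data. The function name, the decimal unit
conversion, and the example figures are assumptions made for illustration and
are not taken from any OpenStack component.

```python
def replication_bandwidth_mbps(changed_gb_per_day, window_hours=24.0):
    """Average bandwidth in Mbit/s needed to copy the changed data within the window."""
    if changed_gb_per_day < 0 or window_hours <= 0:
        raise ValueError("changed data must be >= 0 and the window must be > 0")
    megabits = changed_gb_per_day * 8 * 1000  # 1 GB = 8000 Mbit (decimal units)
    return megabits / (window_hours * 3600)


# Hypothetical workload: 1080 GB of changes per day, replicated continuously.
print(round(replication_bandwidth_mbps(1080), 1))
```

Shortening the replication window raises the required bandwidth in proportion,
which is one reason a highly changeable data store can dominate the cost of the
inter-site links.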
If the design incorporates more than one site, the ability to maintain object
availability in both sites has significant implications on the Object Storage
design and implementation. It also has a significant impact on the WAN network
design between the sites.
If applications running in a cloud are not cloud-aware, there should be clear
measures and expectations to define what the infrastructure can and cannot
support. An example would be shared storage between sites. It is possible;
however, such a solution is not native to OpenStack and requires a third-party
hardware vendor to fulfill such a requirement. Another example can be seen in
applications that are able to consume resources in object storage directly.
Connecting more than two sites increases the challenges and adds more
complexity to the design considerations. Multi-site implementations require
planning to address the additional topology used for internal and external
connectivity. Some options include full mesh topology, hub-and-spoke,
spine-leaf, and 3D torus.
For more information on high availability in OpenStack, see the `OpenStack High
Availability Guide <https://docs.openstack.org/ha-guide/>`_.
Site loss and recovery
~~~~~~~~~~~~~~~~~~~~~~
Outages can cause partial or full loss of site functionality. Strategies
should be implemented to understand and plan for recovery scenarios.
* The deployed applications need to continue to function and, more
importantly, you must consider the impact on the performance and
reliability of the application if a site is unavailable.
* It is important to understand what happens to the replication of
objects and data between the sites when a site goes down. If this
causes queues to start building up, consider how long these queues
can safely exist until an error occurs.
* After an outage, ensure that operations of a site are resumed when it
comes back online. We recommend that you architect the recovery to
avoid race conditions.
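The question of how long replication queues can safely build up can be
approximated with simple arithmetic. The following is a minimal sketch,
assuming a fixed queue capacity and a steady write rate during the outage. The
function name and the figures are hypothetical and do not correspond to any
OpenStack API.

```python
def hours_until_queue_full(queue_capacity_gb, backlog_gb, write_rate_gb_per_hour):
    """Estimate how long a replication queue can keep growing during an outage."""
    if write_rate_gb_per_hour <= 0:
        raise ValueError("write rate must be positive")
    # Space left before the queue overflows; never negative.
    remaining_gb = max(queue_capacity_gb - backlog_gb, 0)
    return remaining_gb / write_rate_gb_per_hour


# Hypothetical site: 500 GB of queue space, 100 GB already queued,
# 20 GB of new writes per hour while the remote site is down.
print(hours_until_queue_full(500, 100, 20))
```

Comparing this figure with the expected time to restore a site gives a first
indication of whether the queue is large enough.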
Replicating inter-site data
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Traditionally, replication has been the best method of protecting object store
implementations. A variety of replication methods exist in storage
architectures, for example synchronous and asynchronous mirroring. Most object
stores and back-end storage systems implement methods for replication at the
storage subsystem layer. Object stores also tailor replication techniques to
fit a cloud's requirements.
Organizations must find the right balance between data integrity and data
availability. Replication strategy may also influence disaster recovery
methods.
Replication across different racks, data centers, and geographical regions
increases focus on determining and ensuring data locality. The ability to
guarantee data is accessed from the nearest or fastest storage can be necessary
for applications to perform well.
.. note::
When running embedded object store methods, ensure that you do not
instigate extra data replication as this may cause performance issues.


@ -0,0 +1,259 @@
========================
Operational requirements
========================
This section describes operational factors affecting the design of an
OpenStack cloud.
Network design
~~~~~~~~~~~~~~
The network design for an OpenStack cluster includes decisions regarding
the interconnect needs within the cluster, the need to allow clients to
access their resources, and the access requirements for operators to
administrate the cluster. You should consider the bandwidth, latency,
and reliability of these networks.
Consider additional design decisions about monitoring and alarming.
If you are using an external provider, service level agreements (SLAs)
are typically defined in your contract. Operational considerations such
as bandwidth, latency, and jitter can be part of the SLA.
As demand for network resources increases, make sure your network design
accommodates expansion and upgrades, such as operators adding IP address
blocks and bandwidth capacity. In addition, consider
managing hardware and software lifecycle events, for example upgrades,
decommissioning, and outages, while avoiding service interruptions for
tenants.
Factor maintainability into the overall network design. This includes
the ability to manage and maintain IP addresses as well as the use of
overlay identifiers including VLAN tag IDs, GRE tunnel IDs, and MPLS
tags. As an example, if you need to change all of the IP addresses
on a network, a process known as renumbering, then the design must
support this function.
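Renumbering is easier to plan when each host keeps its offset within the
subnet. The following minimal sketch uses the Python standard library
``ipaddress`` module to map an address from an old network to a new one. The
``renumber`` helper and the example networks are hypothetical and only
illustrate the idea.

```python
import ipaddress


def renumber(address, old_network, new_network):
    """Map an address to the same host offset in a new network."""
    old = ipaddress.ip_network(old_network)
    new = ipaddress.ip_network(new_network)
    addr = ipaddress.ip_address(address)
    if addr not in old:
        raise ValueError(f"{addr} is not part of {old}")
    if new.num_addresses < old.num_addresses:
        raise ValueError("new network is too small to hold the old one")
    # Keep the host part (offset from the network address) unchanged.
    offset = int(addr) - int(old.network_address)
    return str(new.network_address + offset)


print(renumber("10.0.1.25", "10.0.1.0/24", "192.0.2.0/24"))
```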
Address network-focused applications when considering certain
operational realities. For example, consider the impending exhaustion of
IPv4 addresses, the migration to IPv6, and the use of private networks
to segregate different types of traffic that an application receives or
generates. In the case of IPv4 to IPv6 migrations, applications should
follow best practices for storing IP addresses. We recommend you avoid
relying on IPv4 features that did not carry over to the IPv6 protocol or
have differences in implementation.
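One way to follow this advice is to parse addresses with a library that
handles both protocols instead of treating them as dotted-quad strings. The
following is a minimal sketch using the Python standard library ``ipaddress``
module. The ``normalize_address`` helper is a hypothetical name, and storing
the canonical text form is only one possible choice.

```python
import ipaddress


def normalize_address(text):
    """Parse an IPv4 or IPv6 address and return (version, canonical string)."""
    address = ipaddress.ip_address(text)
    return address.version, address.compressed


for raw in ("192.0.2.10", "2001:0db8:0000:0000:0000:0000:0000:0001"):
    print(normalize_address(raw))
```

Storing the canonical form avoids duplicate records for the same IPv6 address
written in different notations.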
To segregate traffic, allow applications to create a private tenant
network for database and storage network traffic. Use a public network
for services that require direct client access from the Internet. Upon
segregating the traffic, consider :term:`quality of service (QoS)` and
security to ensure each network has the required level of service.
Also consider the routing of network traffic. For some applications,
develop a complex policy framework for routing. To create a routing
policy that satisfies business requirements, consider the economic cost
of transmitting traffic over expensive links versus cheaper links, in
addition to bandwidth, latency, and jitter requirements.
Finally, consider how to respond to network events. How load
transfers from one link to another during a failure scenario could be
a factor in the design. If you do not plan network capacity
correctly, failover traffic could overwhelm other ports or network
links and create a cascading failure scenario. In this case,
traffic that fails over to one link overwhelms that link and then
moves to the subsequent links until all network traffic stops.
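The failover scenario above can be checked during capacity planning. The
following minimal sketch assumes that traffic from a failed link is spread
evenly across the surviving links and that all links share the same capacity.
Both are simplifying assumptions, and the function and link names are
hypothetical.

```python
def overloaded_after_failure(link_loads, link_capacity, failed_link):
    """Return the surviving links that would exceed capacity after a failure."""
    survivors = {
        name: load for name, load in link_loads.items() if name != failed_link
    }
    if not survivors:
        return []
    # Assumption: the failed link's traffic is shared equally by the survivors.
    extra = link_loads[failed_link] / len(survivors)
    return sorted(
        name for name, load in survivors.items() if load + extra > link_capacity
    )


# Hypothetical loads in Gbit/s on three 10 Gbit/s links.
loads = {"a": 6.0, "b": 8.0, "c": 4.0}
print(overloaded_after_failure(loads, 10.0, "a"))
```

Any link reported here is a candidate for the next failure in a cascade, so the
check can be repeated for every link in the design.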
SLA considerations
~~~~~~~~~~~~~~~~~~
Service-level agreements (SLAs) define the levels of availability that will
impact the design of an OpenStack cloud to provide redundancy and high
availability.
SLA terms that affect the design include:
* API availability guarantees implying multiple infrastructure services
and highly available load balancers.
* Network uptime guarantees affecting switch design, which might
require redundant switching and power.
* Networking security policy requirements.
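An availability percentage in an SLA translates directly into a downtime
budget, which is often easier to reason about during design. The following
minimal sketch shows the arithmetic, assuming a 365-day year. The function name
and the example percentage are illustrative only.

```python
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Convert an availability target into the downtime allowed in the period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


# Hypothetical target of 99.9% availability over one year.
print(round(allowed_downtime_hours(99.9), 2))
```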
In any environment larger than just a few hosts, there are two areas
that might be subject to an SLA:
* Data Plane - services that provide virtualization, networking, and
storage. Customers usually require these services to be continuously
available.
* Control Plane - ancillary services such as API endpoints, and services that
control CRUD operations. The services in this category are usually subject to
a different SLA expectation and may be better suited to separate
hardware or containers from the Data Plane services.
To effectively run cloud installations, initial downtime planning includes
creating processes and architectures that support planned maintenance
and unplanned system faults.
It is important to determine as part of the SLA negotiation which party is
responsible for monitoring and starting up the Compute service instances if an
outage occurs.
Upgrading, patching, and changing configuration items may require
downtime for some services. Stopping services that form the Control Plane may
not impact the Data Plane. Live-migration of Compute instances may be required
to perform any actions that require downtime to Data Plane components.
There are many services outside the realm of pure OpenStack
code that affect the ability of a cloud design to meet SLAs, including:
* Database services, such as ``MySQL`` or ``PostgreSQL``.
* Services providing RPC, such as ``RabbitMQ``.
* External network attachments.
* Physical constraints such as power, rack space, network cabling, etc.
* Shared storage including SAN based arrays, storage clusters such as ``Ceph``,
and/or NFS services.
Depending on the design, some network service functions may fall into both the
Control and Data Plane categories. For example, the neutron L3 Agent service
may be considered a Control Plane component, but the routers themselves would
be a Data Plane component.
In a design with multiple regions, the SLA would also need to take into
consideration the use of shared services such as the Identity service
and Dashboard.
Any SLA negotiation must also take into account the reliance on third parties
for critical aspects of the design. For example, if there is an existing SLA
on a component such as a storage system, the SLA must take into account this
limitation. If the required SLA for the cloud exceeds the agreed uptime levels
of the cloud components, additional redundancy would be required. This
consideration is critical in a hybrid cloud design, where there are multiple
third parties involved.
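The effect of component SLAs on the overall SLA can be estimated with basic
probability. The following minimal sketch assumes that component failures are
independent, which real deployments only approximate. The function names and
the availability figures are hypothetical.

```python
def serial_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


def redundant_availability(availability, copies):
    """Availability of identical redundant copies where one is sufficient."""
    return 1 - (1 - availability) ** copies


# Hypothetical third-party storage and network SLAs of 99.9% each.
print(round(serial_availability([0.999, 0.999]), 6))
# Two redundant copies of a component that is 99% available on its own.
print(round(redundant_availability(0.99, 2), 6))
```

A chain of components is always less available than its weakest member, which
is why a cloud SLA that exceeds the SLA of a dependency requires redundancy for
that dependency.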
Support and maintenance
~~~~~~~~~~~~~~~~~~~~~~~
An operations staff supports, manages, and maintains an OpenStack environment.
Their skills may be specialized or varied depending on the size and purpose of
the installation.
The maintenance function of an operator should be taken into consideration:
Maintenance tasks
Operating system patching, hardware/firmware upgrades, and datacenter
related changes, as well as minor and release upgrades to OpenStack
components are all ongoing operational tasks. The six-month release
cycle of the OpenStack projects needs to be considered as part of the
cost of ongoing maintenance. The solution should take into account
storage and network maintenance and the impact on underlying
workloads.
Reliability and availability
Reliability and availability depend on the many supporting components'
availability and on the level of precautions taken by the service provider.
This includes network, storage systems, datacenter, and operating systems.
For more information on
managing and maintaining your OpenStack environment, see the
`OpenStack Operations Guide <https://docs.openstack.org/operations-guide/>`_.
Logging and monitoring
----------------------
OpenStack clouds require appropriate monitoring platforms to identify and
manage errors.
.. note::
We recommend leveraging existing monitoring systems to see if they
are able to effectively monitor an OpenStack environment.
Specific meters that are critically important to capture include:
* Image disk utilization
* Response time to the Compute API
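As an illustration of the second meter, the following minimal sketch times a
single request. The ``request`` argument is a hypothetical callable that would
wrap a call to the Compute API endpoint in a real deployment. The sketch does
not use any OpenStack client library, and existing monitoring systems usually
provide an equivalent check.

```python
import time


def measure_response_time(request, clock=time.perf_counter):
    """Return the seconds elapsed while the request callable runs."""
    start = clock()
    request()
    return clock() - start


# Stand-in for a real API call; replace with a request to your endpoint.
elapsed = measure_response_time(lambda: time.sleep(0.01))
print(f"response time: {elapsed:.3f} s")
```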
Logging and monitoring do not significantly differ for a multi-site OpenStack
cloud. The tools described in the `Logging and monitoring
<https://docs.openstack.org/operations-guide/ops-logging-monitoring.html>`__
section of the Operations Guide remain applicable. Logging and monitoring can
be provided on a per-site basis, and in a common centralized location.
When attempting to deploy logging and monitoring facilities to a centralized
location, care must be taken with the load placed on the inter-site networking
links.
Management software
-------------------
Management software providing clustering, logging, monitoring, and alerting
details for a cloud environment is often used. This affects the overall
OpenStack cloud design, which must account for the additional resource
consumption such as CPU, RAM, storage, and network
bandwidth.
The inclusion of clustering software, such as Corosync or Pacemaker, is
primarily determined by the availability of the cloud infrastructure and
the complexity of supporting the configuration after it is deployed. The
`OpenStack High Availability Guide <https://docs.openstack.org/ha-guide/>`_
provides more details on the installation and configuration of Corosync
and Pacemaker, should these packages need to be included in the design.
Some other potential design impacts include:
* OS-hypervisor combination
Ensure that the selected logging, monitoring, or alerting tools support
the proposed OS-hypervisor combination.
* Network hardware
The network hardware selection needs to be supported by the logging,
monitoring, and alerting software.
Database software
-----------------
Most OpenStack components require access to back-end database services
to store state and configuration information. Choose an appropriate
back-end database which satisfies the availability and fault tolerance
requirements of the OpenStack services.
MySQL is the default database for OpenStack, but other compatible
databases are available.
.. note::
Telemetry uses MongoDB.
The chosen high availability database solution changes according to the
selected database. MySQL, for example, provides several options. Use a
replication technology such as Galera for active-active clustering. For
active-passive use some form of shared storage. Each of these potential
solutions has an impact on the design:
* Solutions that employ Galera/MariaDB require at least three MySQL
nodes.
* MongoDB has its own design considerations for high availability.
* OpenStack design, generally, does not include shared storage.
However, for some high availability designs, certain components might
require it depending on the specific implementation.
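The three-node minimum for Galera-based solutions follows from majority
quorum: the surviving nodes must form a strict majority to keep accepting
writes. The following is a simplified sketch that assumes equally weighted
nodes. It does not call any Galera API, and the function name is hypothetical.

```python
def has_quorum(total_nodes, failed_nodes):
    """True if the surviving nodes form a strict majority of the cluster."""
    surviving = total_nodes - failed_nodes
    return surviving > total_nodes / 2


for nodes in (2, 3, 5):
    print(nodes, "nodes, one failure, quorum kept:", has_quorum(nodes, 1))
```

With two nodes, a single failure leaves exactly half of the cluster, which is
not a majority, so a third node is needed to tolerate one failure.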
Operator access to systems
~~~~~~~~~~~~~~~~~~~~~~~~~~
There is a trend toward hosting cloud operations systems within the cloud
environment. Operators require access to these systems to resolve a major
incident.
Ensure that the network structure connects all clouds to form an integrated
system. Also consider the state of handoffs which must be reliable and have
minimal latency for optimal performance of the system.
If a significant portion of the cloud is on externally managed systems,
prepare for situations where it may not be possible to make changes.
Additionally, cloud providers may differ on how infrastructure must be managed
and exposed. This can lead to delays in root cause analysis where a provider
insists the blame lies with the other provider.


@ -0,0 +1,230 @@
.. ## WARNING ##########################################################
.. This file is synced from openstack/openstack-manuals repository to
.. other related repositories. If you need to make changes to this file,
.. make the changes in openstack-manuals. After any change merged to,
.. openstack-manuals, automatically a patch for others will be proposed.
.. #####################################################################
=================
Community support
=================
The following resources are available to help you run and use OpenStack.
The OpenStack community constantly improves and adds to the main
features of OpenStack, but if you have any questions, do not hesitate to
ask. Use the following resources to get OpenStack support and
troubleshoot your installations.
Documentation
~~~~~~~~~~~~~
For the available OpenStack documentation, see
`docs.openstack.org <https://docs.openstack.org>`_.
The following guides explain how to install a Proof-of-Concept OpenStack cloud
and its associated components:
* `Rocky Installation Guides <https://docs.openstack.org/rocky/install/>`_
The following books explain how to configure and run an OpenStack cloud:
* `Architecture Design Guide <https://docs.openstack.org/arch-design/>`_
* `Rocky Administrator Guides <https://docs.openstack.org/rocky/admin/>`_
* `Rocky Configuration Guides <https://docs.openstack.org/rocky/configuration/>`_
* `Rocky Networking Guide <https://docs.openstack.org/neutron/rocky/admin/>`_
* `High Availability Guide <https://docs.openstack.org/ha-guide/>`_
* `Security Guide <https://docs.openstack.org/security-guide/>`_
* `Virtual Machine Image Guide <https://docs.openstack.org/image-guide/>`_
The following book explains how to use the command-line clients:
* `Rocky API Bindings
<https://docs.openstack.org/rocky/language-bindings.html>`_
The following documentation provides reference and guidance information
for the OpenStack APIs:
* `API Documentation <https://developer.openstack.org/api-guide/quick-start/>`_
The following guide provides information on how to contribute to OpenStack
documentation:
* `Documentation Contributor Guide <https://docs.openstack.org/doc-contrib-guide/>`_
ask.openstack.org
~~~~~~~~~~~~~~~~~
During the setup or testing of OpenStack, you might have questions
about how a specific task is completed or be in a situation where a
feature does not work correctly. Use the
`ask.openstack.org <https://ask.openstack.org>`_ site to ask questions
and get answers. When you visit the `Ask OpenStack
<https://ask.openstack.org>`_ site, scan
the recently asked questions to see whether your question has already
been answered. If not, ask a new question. Be sure to give a clear,
concise summary in the title and provide as much detail as possible in
the description. Paste in your command output or stack traces, links to
screen shots, and any other information which might be useful.
The OpenStack wiki
~~~~~~~~~~~~~~~~~~
The `OpenStack wiki <https://wiki.openstack.org/>`_ contains a broad
range of topics but some of the information can be difficult to find or
is a few pages deep. Fortunately, the wiki search feature enables you to
search by title or content. If you search for specific information, such
as networking or OpenStack Compute, you can find a large amount
of relevant material. More is being added all the time, so be sure to
check back often. You can find the search box in the upper-right corner
of any OpenStack wiki page.
The Launchpad bugs area
~~~~~~~~~~~~~~~~~~~~~~~
The OpenStack community values your setup and testing efforts and wants
your feedback. To log a bug, you must `sign up for a Launchpad account
<https://launchpad.net/+login>`_. You can view existing bugs and report bugs
in the Launchpad Bugs area. Use the search feature to determine whether
the bug has already been reported or already been fixed. If it still
seems like your bug is unreported, fill out a bug report.
Some tips:
* Give a clear, concise summary.
* Provide as much detail as possible in the description. Paste in your
command output or stack traces, links to screen shots, and any other
information which might be useful.
* Be sure to include the software and package versions that you are
using, especially if you are using a development branch, such as,
``"Kilo release" vs git commit bc79c3ecc55929bac585d04a03475b72e06a3208``.
* Any deployment-specific information is helpful, such as whether you
are using Ubuntu 14.04 or are performing a multi-node installation.
The following Launchpad Bugs areas are available:
* `Bugs: OpenStack Block Storage
(cinder) <https://bugs.launchpad.net/cinder>`_
* `Bugs: OpenStack Compute (nova) <https://bugs.launchpad.net/nova>`_
* `Bugs: OpenStack Dashboard
(horizon) <https://bugs.launchpad.net/horizon>`_
* `Bugs: OpenStack Identity
(keystone) <https://bugs.launchpad.net/keystone>`_
* `Bugs: OpenStack Image service
(glance) <https://bugs.launchpad.net/glance>`_
* `Bugs: OpenStack Networking
(neutron) <https://bugs.launchpad.net/neutron>`_
* `Bugs: OpenStack Object Storage
(swift) <https://bugs.launchpad.net/swift>`_
* `Bugs: Application catalog (murano) <https://bugs.launchpad.net/murano>`_
* `Bugs: Bare metal service (ironic) <https://bugs.launchpad.net/ironic>`_
* `Bugs: Clustering service (senlin) <https://bugs.launchpad.net/senlin>`_
* `Bugs: Container Infrastructure Management service (magnum) <https://bugs.launchpad.net/magnum>`_
* `Bugs: Data processing service
(sahara) <https://bugs.launchpad.net/sahara>`_
* `Bugs: Database service (trove) <https://bugs.launchpad.net/trove>`_
* `Bugs: DNS service (designate) <https://bugs.launchpad.net/designate>`_
* `Bugs: Key Manager Service (barbican) <https://bugs.launchpad.net/barbican>`_
* `Bugs: Monitoring (monasca) <https://bugs.launchpad.net/monasca>`_
* `Bugs: Orchestration (heat) <https://bugs.launchpad.net/heat>`_
* `Bugs: Rating (cloudkitty) <https://bugs.launchpad.net/cloudkitty>`_
* `Bugs: Shared file systems (manila) <https://bugs.launchpad.net/manila>`_
* `Bugs: Telemetry
(ceilometer) <https://bugs.launchpad.net/ceilometer>`_
* `Bugs: Telemetry v3
(gnocchi) <https://bugs.launchpad.net/gnocchi>`_
* `Bugs: Workflow service
(mistral) <https://bugs.launchpad.net/mistral>`_
* `Bugs: Messaging service
(zaqar) <https://bugs.launchpad.net/zaqar>`_
* `Bugs: Container service
(zun) <https://bugs.launchpad.net/zun>`_
* `Bugs: OpenStack API Documentation
(developer.openstack.org) <https://bugs.launchpad.net/openstack-api-site>`_
* `Bugs: OpenStack Documentation
(docs.openstack.org) <https://bugs.launchpad.net/openstack-manuals>`_
Documentation feedback
~~~~~~~~~~~~~~~~~~~~~~
To provide feedback on documentation, join our IRC channel ``#openstack-doc``
on the Freenode IRC network, or `report a bug in Launchpad
<https://bugs.launchpad.net/openstack/+filebug>`_ and choose the particular
project that the documentation is a part of.
The OpenStack IRC channel
~~~~~~~~~~~~~~~~~~~~~~~~~
The OpenStack community lives in the #openstack IRC channel on the
Freenode network. You can hang out, ask questions, or get immediate
feedback for urgent and pressing issues. To install an IRC client or use
a browser-based client, go to
`https://webchat.freenode.net/ <https://webchat.freenode.net>`_. You can
also use `Colloquy <http://colloquy.info/>`_ (Mac OS X),
`mIRC <http://www.mirc.com/>`_ (Windows),
or XChat (Linux). When you are in the IRC channel
and want to share code or command output, the generally accepted method
is to use a Paste Bin. The OpenStack project has one at `Paste
<http://paste.openstack.org>`_. Just paste your longer amounts of text or
logs in the web form and you get a URL that you can paste into the
channel. The OpenStack IRC channel is ``#openstack`` on
``irc.freenode.net``. You can find a list of all OpenStack IRC channels on
the `IRC page on the wiki <https://wiki.openstack.org/wiki/IRC>`_.
OpenStack mailing lists
~~~~~~~~~~~~~~~~~~~~~~~
A great way to get answers and insights is to post your question or
problematic scenario to the OpenStack mailing list. You can learn from
and help others who might have similar issues. To subscribe or view the
archives, go to the `general OpenStack mailing list
<http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack>`_. If you are
interested in the other mailing lists for specific projects or development,
refer to `Mailing Lists <https://wiki.openstack.org/wiki/Mailing_Lists>`_.
OpenStack distribution packages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following Linux distributions provide community-supported packages
for OpenStack:
* **CentOS, Fedora, and Red Hat Enterprise Linux:**
https://www.rdoproject.org/
* **openSUSE and SUSE Linux Enterprise Server:**
https://en.opensuse.org/Portal:OpenStack
* **Ubuntu:** https://wiki.ubuntu.com/OpenStack/CloudArchive


@ -0,0 +1,8 @@
Appendix
~~~~~~~~
.. toctree::
:maxdepth: 1
app-support.rst
glossary.rst


@ -0,0 +1,47 @@
.. ## WARNING ##########################################################
.. This file is synced from openstack/openstack-manuals repository to
.. other related repositories. If you need to make changes to this file,
.. make the changes in openstack-manuals. After any change merged to,
.. openstack-manuals, automatically a patch for others will be proposed.
.. #####################################################################
===========
Conventions
===========
The OpenStack documentation uses several typesetting conventions.
Notices
~~~~~~~
Notices take these forms:
.. note:: A comment with additional information that explains a part of the
text.
.. important:: Something you must be aware of before proceeding.
.. tip:: An extra but helpful piece of practical advice.
.. caution:: Helpful information that prevents the user from making mistakes.
.. warning:: Critical information about the risk of data loss or security
issues.
Command prompts
~~~~~~~~~~~~~~~
.. code-block:: console
$ command
Any user, including the ``root`` user, can run commands that are
prefixed with the ``$`` prompt.
.. code-block:: console
# command
The ``root`` user must run commands that are prefixed with the ``#``
prompt. You can also prefix these commands with the :command:`sudo`
command, if available, to run them.

307
doc/source/conf.py Normal file
View File

@ -0,0 +1,307 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
import os
# import sys
import openstackdocstheme
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
# sys.path.insert(0, os.path.abspath('.'))
# -- General configuration ------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['openstackdocstheme']
# Add any paths that contain templates here, relative to this directory.
# templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The encoding of source files.
# source_encoding = 'utf-8-sig'
# The master toctree document.
master_doc = 'index'
# General information about the project.
repository_name = "openstack/openstack-manuals"
bug_project = 'openstack-manuals'
project = u'Architecture Design Guide'
bug_tag = u'arch-design'
copyright = u'2015-2018, OpenStack contributors'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = ''
# The full version, including alpha/beta/rc tags.
release = ''
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
# language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
# today = ''
# Else, today_fmt is used as the format for a strftime call.
# today_fmt = '%B %d, %Y'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['common/cli*', 'common/nova*', 'common/get-started-*']
# The reST default role (used for this markup: `text`) to use for all
# documents.
# default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
# add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
# add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
# show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# A list of ignored prefixes for module index sorting.
# modindex_common_prefix = []
# If true, keep warnings as "system message" paragraphs in the built documents.
# keep_warnings = False
# -- Options for HTML output ----------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'openstackdocs'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
html_theme_options = {
'display_badge': False
}
# Add any paths that contain custom themes here, relative to this directory.
# html_theme_path = [openstackdocstheme.get_html_theme_path()]
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
# html_title = None
# A shorter title for the navigation bar. Default is the same as html_title.
# html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
# html_logo = None
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
# html_favicon = None
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
# html_static_path = []
# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied
# directly to the root of the documentation.
# html_extra_path = []
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
# So that we can enable "log-a-bug" links from each output HTML page, this
# variable must be set to a format that includes year, month, day, hours and
# minutes.
html_last_updated_fmt = '%Y-%m-%d %H:%M'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
# html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
# html_sidebars = {}
# Additional templates that should be rendered to pages, maps page names to
# template names.
# html_additional_pages = {}
# If false, no module index is generated.
# html_domain_indices = True
# If false, no index is generated.
html_use_index = False
# If true, the index is split into individual pages for each letter.
# html_split_index = False
# If true, links to the reST sources are added to the pages.
html_show_sourcelink = False
# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
# html_show_sphinx = True
# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
# html_show_copyright = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
# html_use_opensearch = ''
# This is the file name suffix for HTML files (e.g. ".xhtml").
# html_file_suffix = None
# Output file base name for HTML help builder.
htmlhelp_basename = 'arch-design'
# If true, publish source files
html_copy_source = False
# -- Options for LaTeX output ---------------------------------------------
pdf_theme_path = openstackdocstheme.get_pdf_theme_path()
openstack_logo = openstackdocstheme.get_openstack_logo_path()
latex_custom_template = r"""
\newcommand{\openstacklogo}{%s}
\usepackage{%s}
""" % (openstack_logo, pdf_theme_path)
latex_engine = 'xelatex'
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
'papersize': 'a4paper',
# The font size ('10pt', '11pt' or '12pt').
'pointsize': '11pt',
# Default figure alignment
'figure_align': 'H',
# Not to generate blank page after chapter
'classoptions': ',openany',
# Additional stuff for the LaTeX preamble.
'preamble': latex_custom_template,
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
('index', 'ArchDesign.tex', u'Architecture Design Guide',
u'OpenStack contributors', 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
# latex_logo = None
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
# latex_use_parts = False
# If true, show page references after internal links.
# latex_show_pagerefs = False
# If true, show URL addresses after external links.
# latex_show_urls = False
# Documents to append as an appendix to all manuals.
# latex_appendices = []
# If false, no module index is generated.
# latex_domain_indices = True
# -- Options for manual page output ---------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
('index', 'ArchDesign', u'Architecture Design Guide',
[u'OpenStack contributors'], 1)
]
# If true, show URL addresses after external links.
# man_show_urls = False
# -- Options for Texinfo output -------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
('index', 'ArchDesign', u'Architecture Design Guide',
u'OpenStack contributors', 'ArchDesign',
'To reap the benefits of OpenStack, you should plan, design, '
'and architect your cloud properly, taking user needs into '
'account and understanding the use cases.',
'Miscellaneous'),
]
# Documents to append as an appendix to all manuals.
# texinfo_appendices = []
# If false, no module index is generated.
# texinfo_domain_indices = True
# How to display URL addresses: 'footnote', 'no', or 'inline'.
# texinfo_show_urls = 'footnote'
# If true, do not generate a @detailmenu in the "Top" node's menu.
# texinfo_no_detailmenu = False
# -- Options for Internationalization output ------------------------------
locale_dirs = ['locale/']
# -- Options for PDF output --------------------------------------------------
pdf_documents = [
('index', u'ArchDesignGuide', u'Architecture Design Guide',
u'OpenStack contributors')
]

View File

@ -0,0 +1,49 @@
=============================
Cloud management architecture
=============================
Complex clouds, in particular hybrid clouds, may require tools to
facilitate working across multiple clouds.
Broker between clouds
Brokering software evaluates relative costs between different
cloud platforms. Cloud Management Platforms (CMPs)
allow the designer to determine the right location for the
workload based on predetermined criteria.
Facilitate orchestration across the clouds
CMPs simplify the migration of application workloads between
public, private, and hybrid cloud platforms.
We recommend using cloud orchestration tools for managing a diverse
portfolio of systems and applications across multiple cloud platforms.
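The brokering decision described above can be sketched in a few lines of
Python. This is a minimal illustration, not the behavior of any particular
CMP. The ``CloudOffer`` fields, the pricing model, and the ``choose_cloud``
helper are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class CloudOffer:
    """Hypothetical description of one candidate cloud platform."""
    name: str
    region: str
    cost_per_vcpu_hour: float   # assumed pricing unit
    fixed_cost_per_hour: float  # assumed flat overhead (support, egress, ...)
    meets_compliance: bool


def hourly_cost(offer: CloudOffer, vcpus: int) -> float:
    """Estimated hourly cost of running a workload of the given size."""
    return offer.fixed_cost_per_hour + offer.cost_per_vcpu_hour * vcpus


def choose_cloud(offers: List[CloudOffer], vcpus: int,
                 allowed_regions: Set[str],
                 require_compliance: bool = True) -> Optional[CloudOffer]:
    """Return the cheapest offer that satisfies the predetermined criteria."""
    candidates = [
        offer for offer in offers
        if offer.region in allowed_regions
        and (offer.meets_compliance or not require_compliance)
    ]
    if not candidates:
        return None  # no platform satisfies the criteria
    return min(candidates, key=lambda offer: hourly_cost(offer, vcpus))
```

Filtering on the hard criteria first and comparing cost second keeps a cheap
but non-compliant platform from ever being selected.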
Technical details
~~~~~~~~~~~~~~~~~
.. TODO
Capacity and scale
~~~~~~~~~~~~~~~~~~
.. TODO
High availability
~~~~~~~~~~~~~~~~~
.. TODO
Operator requirements
~~~~~~~~~~~~~~~~~~~~~
.. TODO
Deployment considerations
~~~~~~~~~~~~~~~~~~~~~~~~~
.. TODO
Maintenance considerations
~~~~~~~~~~~~~~~~~~~~~~~~~~
.. TODO

View File

@ -0,0 +1,20 @@
====================
Compute architecture
====================
.. toctree::
:maxdepth: 3
design-compute/design-compute-arch
design-compute/design-compute-cpu
design-compute/design-compute-hypervisor
design-compute/design-compute-hardware
design-compute/design-compute-overcommit
design-compute/design-compute-storage
design-compute/design-compute-networking
design-compute/design-compute-logging
This section describes some of the choices you need to consider
when designing and building your compute nodes. Compute nodes form the
resource core of the OpenStack Compute cloud, providing the processing, memory,
network and storage resources to run instances.

View File

@ -0,0 +1,104 @@
====================================
Compute server architecture overview
====================================
When designing compute resource pools, consider the number of processors,
amount of memory, network requirements, the quantity of storage required for
each hypervisor, and any requirements for bare metal hosts provisioned
through ironic.
When architecting an OpenStack cloud, as part of the planning process, you
must determine not only what hardware to use but also whether compute
resources will be provided in a single pool or in multiple pools or
availability zones. Consider whether the cloud will provide distinctly
different profiles for compute, for example, CPU-, memory-, or local
storage-based compute nodes. For NFV
or HPC based clouds, there may even be specific network configurations that
should be reserved for those specific workloads on specific compute nodes. This
method of designing specific resources into groups or zones of compute can be
referred to as bin packing.
.. note::
In a bin packing design, each independent resource pool provides service for
specific flavors. Since instances are scheduled onto compute hypervisors,
each independent node's resources will be allocated to efficiently use the
available hardware. While bin packing can separate workload specific
resources onto individual servers, bin packing also requires a common
hardware design, with all hardware nodes within a compute resource pool
sharing a common processor, memory, and storage layout. This makes it easier
to deploy, support, and maintain nodes throughout their lifecycle.
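To make the bin packing idea concrete, the following sketch packs instance
vCPU demands onto identical hosts with a first-fit decreasing heuristic. It
is an illustration of the general technique only. It does not describe how
the Compute scheduler places instances, and the function name and the single
vCPU dimension are assumptions made for brevity.

```python
from typing import List


def first_fit_decreasing(demands: List[int], host_capacity: int) -> List[List[int]]:
    """Pack vCPU demands onto identical hosts and return one list per host."""
    hosts: List[List[int]] = []   # demands placed on each host
    free: List[int] = []          # remaining vCPUs on each host
    for demand in sorted(demands, reverse=True):
        if demand > host_capacity:
            raise ValueError(f"demand {demand} exceeds host capacity {host_capacity}")
        for index, remaining in enumerate(free):
            if demand <= remaining:
                # Place the instance on the first host with enough room
                hosts[index].append(demand)
                free[index] -= demand
                break
        else:
            # No existing host fits, so open a new one
            hosts.append([demand])
            free.append(host_capacity - demand)
    return hosts
```

Because every host in a pool shares the same layout, a single
``host_capacity`` value is enough to describe the pool.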
Increasing the size of the supporting compute environment increases the network
traffic and messages, adding load to the controllers and administrative
services used to support the OpenStack cloud or networking nodes. When
considering hardware for controller nodes, whether using the monolithic
controller design, where all of the controller services live on one or more
physical hardware nodes, or in any of the newer shared nothing control plane
models, adequate resources must be allocated and scaled to meet scale
requirements. Effective monitoring of the environment will help with capacity
decisions on scaling. Proper planning will help avoid bottlenecks and network
oversubscription as the cloud scales.
Compute nodes automatically attach to OpenStack clouds, resulting in a
horizontally scaling process when adding extra compute capacity to an
OpenStack cloud. To further group compute nodes and place nodes into
appropriate availability zones and host aggregates, additional work is
required. It is necessary to plan rack capacity and network switches as scaling
out compute hosts directly affects data center infrastructure resources as
would any other infrastructure expansion.
While not as common in large enterprises, compute host components can also be
upgraded to account for increases in
demand, known as vertical scaling. Upgrading CPUs with more
cores, or increasing the overall server memory, can add extra needed
capacity depending on whether the running applications are more CPU
intensive or memory intensive. We recommend a rolling upgrade of compute
nodes for redundancy and availability.
After the upgrade, when compute nodes return to the OpenStack cluster, they
will be re-scanned and the new resources will be discovered and adjusted in the
OpenStack database.
When selecting a processor, compare features and performance
characteristics. Some processors include features specific to
virtualized compute hosts, such as hardware-assisted virtualization and
hardware-assisted memory paging (for example, Intel Extended Page Tables,
or EPT). These types of features can have a significant impact on the
performance of your virtual machines.
The number of processor cores and threads impacts the number of worker
threads which can be run on a resource node. Design decisions must
relate directly to the service being run on it, as well as provide a
balanced infrastructure for all services.
Another option is to assess the average workloads and increase the
number of instances that can run within the compute environment by
adjusting the overcommit ratio. This ratio is configurable for CPU and
memory. The default CPU overcommit ratio is 16:1, and the default memory
overcommit ratio is 1.5:1. Determining the tuning of the overcommit
ratios during the design phase is important as it has a direct impact on
the hardware layout of your compute nodes.
.. note::
Changing the CPU overcommit ratio can have a detrimental effect
and increase the likelihood of noisy neighbor issues.
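The effect of the overcommit ratios on hardware layout can be estimated with
simple arithmetic. The sketch below is a simplified model, assuming that
schedulable capacity is the physical capacity (minus any reserved memory)
multiplied by the ratio. The function names, the ``reserved_ram_gb``
parameter, and the example flavor are hypothetical. The defaults mirror the
16:1 and 1.5:1 ratios stated above.

```python
from typing import Tuple


def schedulable_capacity(host_cpus: int, ram_gb: float,
                         cpu_ratio: float = 16.0, ram_ratio: float = 1.5,
                         reserved_ram_gb: float = 0.0) -> Tuple[int, float]:
    """Return (vCPUs, RAM in GB) that can be allocated on one host."""
    vcpus = int(host_cpus * cpu_ratio)
    ram = (ram_gb - reserved_ram_gb) * ram_ratio
    return vcpus, ram


def max_instances(host_cpus: int, ram_gb: float,
                  flavor_vcpus: int, flavor_ram_gb: float,
                  **ratios: float) -> int:
    """Instances of one flavor that fit on a host, limited by CPU or RAM."""
    vcpus, ram = schedulable_capacity(host_cpus, ram_gb, **ratios)
    return min(vcpus // flavor_vcpus, int(ram // flavor_ram_gb))
```

With the default ratios, memory rather than CPU is often the limiting
resource, which is worth checking against the flavors you expect to offer.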
Insufficient disk capacity could also have a negative effect on overall
performance including CPU and memory usage. Depending on the back end
architecture of the OpenStack Block Storage layer, capacity includes
adding disk shelves to enterprise storage systems or installing
additional Block Storage nodes. Upgrading directly attached storage
installed in Compute hosts, and adding capacity to the shared storage
for additional ephemeral storage to instances, may be necessary.
Consider the Compute requirements of non-hypervisor nodes (also referred to as
resource nodes). These include controller nodes, Object Storage nodes, Block
Storage nodes, and nodes that run networking services.
The ability to create pools or availability zones for unpredictable workloads
should be considered. In some cases, the demand for certain instance types or
flavors may not justify individual hardware design. Allocate hardware designs
that are capable of servicing the most common instance requests. Adding
hardware to the overall architecture can be done later.

View File

@ -0,0 +1,85 @@
.. _choosing-a-cpu:
==============
Choosing a CPU
==============
The type of CPU in your compute node is a very important decision. You must
ensure that the CPU supports virtualization by way of *VT-x* for Intel chips
and *AMD-V* for AMD chips.
.. tip::
Consult the vendor documentation to check for virtualization support. For
Intel CPUs, see
`Does my processor support Intel® Virtualization Technology?
<https://www.intel.com/content/www/us/en/support/processors/000005486.html>`_. For AMD CPUs,
see `AMD Virtualization
<https://www.amd.com/en-us/innovations/software-technologies/server-solution/virtualization>`_.
Your CPU may support virtualization but it may be disabled. Consult your
BIOS documentation for how to enable CPU features.
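On a Linux host, a quick first check is to look for the ``vmx`` (Intel VT-x)
or ``svm`` (AMD-V) flag in ``/proc/cpuinfo``. The helper below is a small
sketch of that check, and the function name is an assumption for this
example. The flag indicates what the CPU advertises and may not reflect
whether the feature is enabled in the firmware, so still consult the vendor
and BIOS documentation.

```python
from pathlib import Path


def virtualization_support(cpuinfo_text: str) -> str:
    """Return 'VT-x', 'AMD-V', or 'none' based on the advertised CPU flags."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # Each logical CPU has a line such as "flags : fpu vme ... vmx ..."
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
    if "vmx" in flags:
        return "VT-x"
    if "svm" in flags:
        return "AMD-V"
    return "none"


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(virtualization_support(cpuinfo.read_text()))
```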
The number of cores that the CPU has also affects your decision. It is
common for current CPUs to have up to 24 cores. Additionally, if an Intel CPU
supports hyper-threading, each physical core presents two logical cores, so
those 24 cores appear as 48 logical cores. If you
purchase a server that supports multiple CPUs, the number of cores is further
multiplied.
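The multiplication described above is easy to capture when comparing server
options. The helper below is a trivial sketch, and its name and parameters
are chosen for this example only.

```python
def logical_cpus(sockets: int, cores_per_socket: int,
                 threads_per_core: int = 1) -> int:
    """Logical CPUs that a host presents to the operating system."""
    return sockets * cores_per_socket * threads_per_core
```

For example, ``logical_cpus(2, 24, threads_per_core=2)`` describes a
dual-socket server with 24-core CPUs and hyper-threading enabled.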
As of the Kilo release, key enhancements have been added to the
OpenStack code to improve guest performance. These improvements allow the
Compute service to take advantage of greater insight into a compute host's
physical layout and therefore make smarter decisions regarding workload
placement. Administrators can use this functionality to enable smarter planning
choices for use cases like NFV (Network Function Virtualization) and HPC (High
Performance Computing).
Considering non-uniform memory access (NUMA) is important when selecting CPU
sizes and types, as there are use cases that use NUMA pinning to reserve host
cores for operating system processes. This reduces the CPU available for
workloads but protects the operating system.
.. tip::
When CPU pinning is requested for a guest, it is assumed
there is no overcommit (or, an overcommit ratio of 1.0). When dedicated
resourcing is not requested for a workload, the normal overcommit ratios
are applied.
Therefore, we recommend that host aggregates are used to separate not
only bare metal hosts, but hosts that will provide resources for workloads
that require dedicated resources. This said, when workloads are provisioned
to NUMA host aggregates, NUMA nodes are chosen at random and vCPUs can float
across NUMA nodes on a host. If workloads require SR-IOV or DPDK, they should
be assigned to a NUMA node aggregate with hosts that supply the
functionality. More importantly, the workload or vCPUs that are executing
processes for a workload should be on the same NUMA node due to the limited
amount of cross-node memory bandwidth. In all cases, the ``NUMATopologyFilter``
must be enabled for ``nova-scheduler``.
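The requirement that a workload stay on a single NUMA node can be
illustrated with a small placement check. This is a conceptual sketch only.
It is not the logic of the ``NUMATopologyFilter``, and the ``NumaNode`` class
and ``pick_single_node`` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NumaNode:
    """Hypothetical view of the free resources on one host NUMA node."""
    node_id: int
    free_cpus: int
    free_mem_mb: int


def pick_single_node(nodes: List[NumaNode], vcpus: int,
                     mem_mb: int) -> Optional[int]:
    """Return the id of a node that can hold the whole guest, else None."""
    fitting = [
        node for node in nodes
        if node.free_cpus >= vcpus and node.free_mem_mb >= mem_mb
    ]
    if not fitting:
        return None  # the guest would have to span nodes
    # Prefer the tightest fit to keep larger gaps free on other nodes
    best = min(fitting,
               key=lambda node: (node.free_cpus - vcpus,
                                 node.free_mem_mb - mem_mb))
    return best.node_id
```

A result of ``None`` means the guest cannot avoid cross-node memory access
on that host, which is a signal to place it elsewhere or resize the flavor.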
Additionally, CPU selection may not be one-size-fits-all across enterprises,
but more of a list of SKUs that are tuned for the enterprise workloads.
For more information about NUMA, see `CPU topologies
<https://docs.openstack.org/admin-guide/compute-cpu-topologies.html>`_ in
the Administrator Guide.
In order to take advantage of these new enhancements in the Compute service,
compute hosts must be using NUMA capable CPUs.
.. tip::
**Multithread Considerations**
Hyper-Threading is Intel's proprietary simultaneous multithreading
implementation used to improve parallelization on their CPUs. You might
consider enabling Hyper-Threading to improve the performance of
multithreaded applications.
Whether you should enable Hyper-Threading on your CPUs depends upon your use
case. For example, disabling Hyper-Threading can be beneficial in intense
computing environments. We recommend performance testing with your local
workload with both Hyper-Threading on and off to determine what is more
appropriate in your case.
In most cases, hyper-threading CPUs can provide a 1.3x to 2.0x performance
benefit over non-hyper-threaded CPUs depending on types of workload.

View File

@ -0,0 +1,165 @@
========================
Choosing server hardware
========================
Consider the following factors when selecting compute server hardware:
* Server density
A measure of how many servers can fit into a given measure of
physical space, such as a rack unit [U].
* Resource capacity
The number of CPU cores, how much RAM, or how much storage a given
server delivers.
* Expandability
The number of additional resources you can add to a server before it
reaches capacity.
* Cost
The relative cost of the hardware weighed against the total amount of
capacity available on the hardware based on predetermined requirements.
Weigh these considerations against each other to determine the best design for
the desired purpose. For example, increasing server density means sacrificing
resource capacity or expandability. It also can decrease availability and
increase the chance of noisy neighbor issues. Increasing resource capacity and
expandability can increase cost but decrease server density. Decreasing cost
often means decreasing supportability, availability, server density, resource
capacity, and expandability.
Determine the requirements for the cloud prior to constructing the cloud,
and plan for hardware lifecycles, and expansion and new features that may
require different hardware.
If the cloud is initially built with near end-of-life but cost-effective
hardware, then the performance and capacity demand of new workloads will drive
the purchase of more modern hardware. With individual hardware components
changing over time, you may prefer to manage configurations as stock keeping
units (SKUs). This method provides an enterprise with a standard
configuration unit of compute (server) that can be placed in any IT service
manager or vendor supplied ordering system that can be triggered manually or
through advanced operational automations. This simplifies ordering,
provisioning, and activating additional compute resources. For example, there
are plug-ins for several commercial service management tools that enable
integration with hardware APIs. These configure and activate new compute
resources from standby hardware based on standard configurations. Using this
methodology, spare hardware can be ordered for a datacenter and provisioned
based on capacity data derived from OpenStack Telemetry.
Compute capacity (CPU cores and RAM capacity) is a secondary consideration for
selecting server hardware. The required server hardware must supply adequate
CPU sockets, additional CPU cores, and adequate RAM. For more information, see
:ref:`choosing-a-cpu`.
In compute server architecture design, you must also consider network and
storage requirements. For more information on network considerations, see
:ref:`network-design`.
Considerations when choosing hardware
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here are some other factors to consider when selecting hardware for your
compute servers.
Instance density
----------------
More hosts are required to support the anticipated scale
if the design architecture uses dual-socket hardware designs.
For a general purpose OpenStack cloud, sizing is an important consideration.
The expected or anticipated number of instances that each hypervisor can
host is a common meter used in sizing the deployment. The selected server
hardware needs to support the expected or anticipated instance density.
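Instance density translates directly into a host count. The following sketch
shows the arithmetic. The function name and the ``spare_hosts`` parameter,
which models extra hosts kept for failures or maintenance, are assumptions
made for this example.

```python
import math


def hosts_required(expected_instances: int, instances_per_host: int,
                   spare_hosts: int = 0) -> int:
    """Hosts needed to run the expected instances, plus optional spares."""
    if instances_per_host < 1:
        raise ValueError("instances_per_host must be at least 1")
    return math.ceil(expected_instances / instances_per_host) + spare_hosts
```

Rounding up matters here, because a partially filled host still has to be
purchased, racked, powered, and cooled.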
Host density
------------
Another option to address the higher host count is to use a
quad-socket platform. Taking this approach decreases host density
which also increases rack count. This configuration affects the
number of power connections and also impacts network and cooling
requirements.
Physical data centers have limited physical space, power, and
cooling. The number of hosts (or hypervisors) that can be fitted
into a given metric (rack, rack unit, or floor tile) is another
important method of sizing. Floor weight is an often overlooked
consideration.
The data center floor must be able to support the weight of the proposed number
of hosts within a rack or set of racks. These factors need to be applied as
part of the host density calculation and server hardware selection.
Power and cooling density
-------------------------
The power and cooling density requirements might be lower than with
blade, sled, or 1U server designs due to lower host density (by
using 2U, 3U or even 4U server designs). For data centers with older
infrastructure, this might be a desirable feature.
Data centers have a specified amount of power fed to a given rack or
set of racks. Older data centers may have power densities as low as 20A per
rack, and current data centers can be designed to support power densities as
high as 120A per rack. The selected server hardware must take power density
into account.
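A rough power budget shows how strongly the rack feed limits host density.
The sketch below is a simplification that treats the feed as single-phase
and applies a flat derating factor. The 208 V supply, the 80 percent
derating, and the 500 W per-host draw used in the usage note are
illustrative assumptions, not values from this guide.

```python
def hosts_per_rack_by_power(rack_amps: float, volts: float,
                            host_watts: float, derate: float = 0.8) -> int:
    """Hosts that fit within the usable power budget of one rack."""
    usable_watts = rack_amps * volts * derate
    return int(usable_watts // host_watts)
```

Under these assumptions, a 20A rack feed supports only a handful of 500 W
hosts, while a 120A feed supports several dozen, so the same server model can
yield very different rack counts in older and newer data centers.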
Selecting hardware form factor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider the following in selecting server hardware form factor suited for
your OpenStack design architecture:
* Most blade servers can support dual-socket multi-core CPUs. To avoid
this CPU limit, select ``full width`` or ``full height`` blades. Be
aware, however, that this also decreases server density. For example,
high density blade servers such as HP BladeSystem or Dell PowerEdge
M1000e support up to 16 half-height servers in only ten rack units.
Half-height blades are twice as dense as full-height blades, so
using full-height blades results in only eight servers per ten rack units.
* 1U rack-mounted servers have the ability to offer greater server density
than a blade server solution, but are often limited to dual-socket,
multi-core CPU configurations. It is possible to place forty 1U servers
in a rack, providing space for the top of rack (ToR) switches, compared
to 32 full width blade servers.
To obtain greater than dual-socket support in a 1U rack-mount form
factor, customers need to buy their systems from Original Design
Manufacturers (ODMs) or second-tier manufacturers.
.. warning::
This may cause issues for organizations that have preferred
vendor policies or concerns with support and hardware warranties
of non-tier 1 vendors.
* 2U rack-mounted servers provide quad-socket, multi-core CPU support,
but with a corresponding decrease in server density (half the density
that 1U rack-mounted servers offer).
* Larger rack-mounted servers, such as 4U servers, often provide even
greater CPU capacity, commonly supporting four or even eight CPU
sockets. These servers have greater expandability, but such servers
have much lower server density and are often more expensive.
* ``Sled servers`` are rack-mounted servers that support multiple
independent servers in a single 2U or 3U enclosure. These deliver
higher density as compared to typical 1U or 2U rack-mounted servers.
For example, many sled servers offer four independent dual-socket
nodes in 2U for a total of eight CPU sockets in 2U.
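One way to compare the form factors above is CPU sockets per rack unit. The
sketch below uses the server counts given in the list. The table name, the
helper function, and the four-socket figure for full-height blades are
assumptions made for illustration, so substitute the numbers for the
hardware you are evaluating.

```python
# name: (servers per enclosure, sockets per server, rack units per enclosure)
FORM_FACTORS = {
    "half-height blade": (16, 2, 10),
    "full-height blade": (8, 4, 10),   # socket count is an assumption
    "1U rack server": (1, 2, 1),
    "2U rack server": (1, 4, 2),
    "sled (4 nodes in 2U)": (4, 2, 2),
}


def sockets_per_rack_unit(servers: int, sockets: int, units: int) -> float:
    """CPU sockets delivered per rack unit of space."""
    return servers * sockets / units


if __name__ == "__main__":
    for name, spec in FORM_FACTORS.items():
        print(f"{name}: {sockets_per_rack_unit(*spec):.1f} sockets per U")
```

Socket density is only one input. Weigh it against the power, cooling, and
expandability trade-offs described in this section.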
Scaling your cloud
~~~~~~~~~~~~~~~~~~
When designing an OpenStack cloud compute server architecture, you must
decide whether you intend to scale up or scale out. Selecting a
smaller number of larger hosts, or a larger number of smaller hosts,
depends on a combination of factors: cost, power, cooling, physical rack
and floor space, support-warranty, and manageability. Typically, the scale out
model has been popular for OpenStack because it reduces the impact of any
single failure by spreading workloads across more infrastructure.
However, the downside is the cost of additional servers and the datacenter
resources needed to power, network, and cool the servers.
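The effect of spreading workloads across more infrastructure can be
quantified by estimating how many instances a single host failure affects.
The sketch below assumes instances are spread evenly across hosts, which is a
simplification, and the function name is chosen for this example.

```python
import math


def instances_lost_per_host_failure(total_instances: int,
                                    host_count: int) -> int:
    """Worst-case instances affected when one evenly loaded host fails."""
    if host_count < 1:
        raise ValueError("host_count must be at least 1")
    return math.ceil(total_instances / host_count)
```

Comparing a design with a few large hosts against one with many small hosts
makes the trade-off between failure impact and datacenter cost explicit.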

View File

@ -0,0 +1,46 @@
======================
Choosing a hypervisor
======================
A hypervisor provides software to manage virtual machine access to the
underlying hardware. The hypervisor creates, manages, and monitors
virtual machines. OpenStack Compute (nova) supports many hypervisors to various
degrees, including:
* `Ironic <https://docs.openstack.org/ironic/latest/>`_
* `KVM <https://www.linux-kvm.org/page/Main_Page>`_
* `LXC <https://linuxcontainers.org/>`_
* `QEMU <https://wiki.qemu.org/Main_Page>`_
* `VMware ESX/ESXi <https://www.vmware.com/support/vsphere-hypervisor.html>`_
* `Xen (using libvirt) <https://www.xenproject.org>`_
* `XenServer <https://xenserver.org>`_
* `Hyper-V
<https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/hyper-v-technology-overview>`_
* `PowerVM <https://www.ibm.com/us-en/marketplace/ibm-powervm>`_
* `UML <http://user-mode-linux.sourceforge.net>`_
* `Virtuozzo <https://www.virtuozzo.com/products/vz7.html>`_
* `zVM <https://www.ibm.com/it-infrastructure/z/zvm>`_
An important factor in your choice of hypervisor is your organization's
current hypervisor usage or experience. Also important are the hypervisor's
feature parity, documentation, and the level of community experience.
As per the recent OpenStack user survey, KVM is the most widely adopted
hypervisor in the OpenStack community. Besides KVM, there are many deployments
that run other hypervisors such as LXC, VMware, Xen, and Hyper-V. However,
these hypervisors are either less used, are niche hypervisors, or have limited
functionality compared to more commonly used hypervisors.
.. note::
It is also possible to run multiple hypervisors in a single
deployment using host aggregates or cells. However, an individual
compute node can run only a single hypervisor at a time.
For more information about feature support for
hypervisors as well as ironic and Virtuozzo (formerly Parallels), see
`Hypervisor Support Matrix
<https://docs.openstack.org/nova/latest/user/support-matrix.html>`_
and `Hypervisors
<https://docs.openstack.org/ocata/config-reference/compute/hypervisors.html>`_
in the Configuration Reference.

View File

@ -0,0 +1,105 @@
======================
Compute server logging
======================
The logs on the compute nodes, or any server running nova-compute (for example
in a hyperconverged architecture), are the primary points for troubleshooting
issues with the hypervisor and compute services. Additionally, operating system
logs can also provide useful information.
As the cloud environment grows, the amount of log data increases substantially.
Enabling debugging on either the OpenStack services or the operating system
further compounds the data issues.
Logging is described in more detail in `Logging and Monitoring
<https://docs.openstack.org/operations-guide/ops-logging-monitoring.html>`_
in the Operations Guide.
It is also an important design consideration to take into account before
commencing operations of your cloud.
OpenStack produces a great deal of useful logging information, but for
the information to be useful for operations purposes, you should consider
having a central logging server to send logs to, and a log parsing/analysis
system such as Elastic Stack (formerly known as ELK).
Elastic Stack consists mainly of three components: Elasticsearch (log search
and analysis), Logstash (log intake, processing, and output), and Kibana (log
dashboard service).
.. figure:: ../figures/ELKbasicArch.png
:align: center
:alt: Elastic Search Basic Architecture
Because of the volume of logs sent from servers in the OpenStack environment,
an optional in-memory data structure store can be used as a buffer. Common
examples are Redis and Memcached. In newer versions of Elastic Stack, a
lightweight log shipper called
`Filebeat <https://www.elastic.co/products/beats/filebeat>`_ serves a
similar purpose and adds a "backpressure-sensitive" protocol when sending data
to Logstash or Elasticsearch.
Log analysis often requires disparate logs of differing formats. Elastic
Stack (namely Logstash) was created to take many different log inputs and
transform them into a consistent format that Elasticsearch can catalog and
analyze. As seen in the image above, ingestion starts on the servers with
Logstash. Events are then forwarded to the Elasticsearch server for storage
and searching, and finally displayed through Kibana for visual analysis and
interaction.
For instructions on installing Logstash, Elasticsearch and Kibana, see the
`Elasticsearch reference
<https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html>`_.
Logstash needs some OpenStack-specific configuration. To collect, parse, and
send the correct portions of log files to the Elasticsearch server, the
configuration file must define input, filter, and output sections. Input
configurations tell Logstash where to receive data from (log files,
forwarders, Filebeat, standard input, or the Windows event log), filter
configurations define how the input is processed and which contents are
forwarded, and output configurations specify where to put the data.
The Logstash filter performs intermediary processing on each event. Conditional
filters are applied based on the characteristics of the input and the event.
Some examples of filtering are:
* grok
* date
* csv
* json
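As a rough illustration of what a grok filter followed by a date filter does to a single event, the following Python sketch splits one log line into named fields and converts the timestamp. It is not Logstash configuration. The line layout, the pattern, and the names ``LOG_LINE`` and ``parse_line`` are assumptions made for this example and may not match the log format of your deployment.

```python
import re
from datetime import datetime

# Assumed layout: "<date> <time>.<ms> <pid> <LEVEL> <logger> <message>".
# Adjust the pattern to the log format your services actually emit.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) "
    r"(?P<level>[A-Z]+) "
    r"(?P<logger>\S+) "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Comparable to a date filter: turn the text timestamp into a datetime.
    event["timestamp"] = datetime.strptime(
        event["timestamp"], "%Y-%m-%d %H:%M:%S.%f"
    )
    event["pid"] = int(event["pid"])
    return event


# Made-up sample line used only to exercise the parser.
sample = (
    "2018-11-29 14:23:52.123 4242 INFO nova.compute.manager "
    "Instance spawned successfully."
)
event = parse_line(sample)
print(event["level"], event["logger"], event["message"])
```

Lines that do not match return ``None``, which mirrors the way unparsed events are usually tagged and handled separately rather than silently dropped.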
There are also output plugins available that send event data to many different
destinations. Some examples are:
* csv
* redis
* elasticsearch
* file
* jira
* nagios
* pagerduty
* stdout
Additionally, there are several codecs that can be used to change the data
representation of events, such as:
* collectd
* graphite
* json
* plain
* rubydebug
These input, output and filter configurations are typically stored in
:file:`/etc/logstash/conf.d` but may vary by Linux distribution. Separate
configuration files should be created for different logging systems such as
syslog, Apache, and OpenStack.
General examples and configuration guides can be found on the Elastic `Logstash
Configuration page
<https://www.elastic.co/guide/en/logstash/current/configuration-file-structure.html>`_.
OpenStack input, output and filter examples can be found at
`sorantis/elkstack
<https://github.com/sorantis/elkstack/tree/master/elk/logstash>`_.
Once a configuration is complete, Kibana can be used as a visualization tool
for OpenStack and system logging. This allows operators to configure custom
dashboards for performance, monitoring, and security.


@ -0,0 +1,51 @@
====================
Network connectivity
====================
The selected server hardware must have the appropriate number and type of
network connections to support the proposed architecture. Ensure that at
least two diverse network connections come into each rack.
The selection of form factors or architectures affects the selection of server
hardware. Ensure that the selected server hardware is configured to support
enough storage capacity (or storage expandability) to match the requirements of
the selected scale-out storage solution. Similarly, the network architecture
impacts the server hardware selection and vice versa.
While each enterprise install is different, the following networks with their
proposed bandwidth are highly recommended for a basic production OpenStack
install.
**Install or OOB network** - Typically used by most distributions and
provisioning tools as the network for deploying base software to the
OpenStack compute nodes. This network should be connected at a minimum of 1Gb,
and routing is usually not needed.
**Internal or Management network** - Used as the internal communication network
between OpenStack compute and control nodes. Can also be used as a network
for iSCSI communication between the compute and iSCSI storage nodes. Again,
this should be a minimum of a 1Gb NIC and should be a non-routed network. This
interface should be redundant for high availability (HA).
**Tenant network** - A private network that enables communication between each
tenant's instances. If using flat networking and provider networks, this
network is optional. This network should also be isolated from all other
networks for security compliance. A 1Gb interface should be sufficient, and
it should be redundant for HA.
**Storage network** - A private network which could be connected to the Ceph
frontend or other shared storage. For HA purposes this should be a redundant
configuration with suggested 10Gb NICs. This network isolates the storage for
the instances away from other networks. Under load, this storage traffic
could overwhelm other networks and cause outages on other OpenStack services.
**(Optional) External or Public network** - This network is used to communicate
externally from the VMs to the public network space. These addresses are
typically handled by the neutron agent on the controller nodes and can also
be handled by an SDN other than neutron. However, when using neutron DVR with
OVS, this network must be present on the compute node because north-south
traffic is handled by the compute node itself rather than by the controller
nodes. For more information on DVR with OVS and compute nodes, see
`Open vSwitch: High availability using DVR
<https://docs.openstack.org/ocata/networking-guide/deploy-ovs-ha-dvr.html>`_.


@ -0,0 +1,48 @@
==========================
Overcommitting CPU and RAM
==========================
OpenStack allows you to overcommit CPU and RAM on compute nodes. This
allows you to increase the number of instances running on your cloud at the
cost of reducing the performance of the instances. The Compute service uses the
following ratios by default:
* CPU allocation ratio: 16:1
* RAM allocation ratio: 1.5:1
The default CPU allocation ratio of 16:1 means that the scheduler
allocates up to 16 virtual cores per physical core. For example, if a
physical node has 12 cores, the scheduler sees 192 available virtual
cores. With typical flavor definitions of 4 virtual cores per instance,
this ratio would provide 48 instances on a physical node.
The formula for the number of virtual instances on a compute node is
``(OR*PC)/VC``, where:
OR
CPU overcommit ratio (virtual cores per physical core)
PC
Number of physical cores
VC
Number of virtual cores per instance
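The formula can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only. The function name ``instances_per_node`` and its parameter names are made up for this example and do not correspond to any Compute service API.

```python
def instances_per_node(overcommit_ratio, physical_cores, vcpus_per_instance):
    """Apply (OR*PC)/VC and round down to a whole number of instances."""
    virtual_cores = int(overcommit_ratio * physical_cores)
    return virtual_cores // vcpus_per_instance


# The example from the text: 16:1 ratio, 12 physical cores, 4 vCPUs per flavor.
print(instances_per_node(16, 12, 4))  # prints 48
```

Rounding down matters when the flavor size does not divide the virtual core count evenly, because a partially fitting instance cannot be scheduled.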
Similarly, the default RAM allocation ratio of 1.5:1 means that the
scheduler allocates instances to a physical node as long as the total
amount of RAM associated with the instances is less than 1.5 times the
amount of RAM available on the physical node.
For example, if a physical node has 48 GB of RAM, the scheduler
allocates instances to that node until the sum of the RAM associated
with the instances reaches 72 GB (such as nine instances, in the case
where each instance has 8 GB of RAM).
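The same kind of check works for RAM. The sketch below reproduces the arithmetic of the example above. The function names ``ram_capacity_gb`` and ``instances_by_ram`` are hypothetical helpers for this illustration, not part of OpenStack.

```python
def ram_capacity_gb(physical_ram_gb, ram_ratio=1.5):
    """Total RAM the scheduler is willing to allocate on one node."""
    return physical_ram_gb * ram_ratio


def instances_by_ram(physical_ram_gb, instance_ram_gb, ram_ratio=1.5):
    """Whole number of equally sized instances that fit within that total."""
    return int(ram_capacity_gb(physical_ram_gb, ram_ratio) // instance_ram_gb)


# The example from the text: a 48 GB node and 8 GB instances.
print(ram_capacity_gb(48))      # prints 72.0
print(instances_by_ram(48, 8))  # prints 9
```

In practice the lower of the CPU-based and RAM-based limits determines how many instances a node can host.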
.. note::
Regardless of the overcommit ratio, an instance cannot be placed
on any physical node with fewer raw (pre-overcommit) resources than
the instance flavor requires.
You must select the appropriate CPU and RAM allocation ratio for your
particular use case.


@ -0,0 +1,154 @@
==========================
Instance storage solutions
==========================
As part of the architecture design for a compute cluster, you must specify
storage for the disk on which the instantiated instance runs. There are three
main approaches to providing temporary storage:
* Off compute node storage—shared file system
* On compute node storage—shared file system
* On compute node storage—nonshared file system
In general, the questions you should ask when selecting storage are as
follows:
* What are my workloads?
* Do my workloads have IOPS requirements?
* Are there read, write, or random access performance requirements?
* What is my forecast for the scaling of storage for compute?
* What storage is my enterprise currently using? Can it be re-purposed?
* How do I manage the storage operationally?
Many operators use separate compute and storage hosts instead of a
hyperconverged solution. Compute services and storage services have different
requirements, and compute hosts typically require more CPU and RAM than storage
hosts. Therefore, for a fixed budget, it makes sense to have different
configurations for your compute nodes and your storage nodes: invest in CPU
and RAM for compute nodes, and in block storage for storage nodes.
However, if you are more restricted in the number of physical hosts you have
available for creating your cloud and you want to be able to dedicate as many
of your hosts as possible to running instances, it makes sense to run compute
and storage on the same machines or use an existing storage array that is
available.
The three main approaches to instance storage are provided in the next
few sections.
Non-compute node based shared file system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this option, the disks storing the running instances are hosted in
servers outside of the compute nodes.
If you use separate compute and storage hosts, you can treat your
compute hosts as "stateless". As long as you do not have any instances
currently running on a compute host, you can take it offline or wipe it
completely without having any effect on the rest of your cloud. This
simplifies maintenance for the compute hosts.
There are several advantages to this approach:
* If a compute node fails, instances are usually easily recoverable.
* Running a dedicated storage system can be operationally simpler.
* You can scale to any number of spindles.
* It may be possible to share the external storage for other purposes.
The main disadvantages to this approach are:
* Depending on design, heavy I/O usage from some instances can affect
unrelated instances.
* Use of the network can decrease performance.
* Scalability can be affected by network architecture.
On compute node storage—shared file system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this option, each compute node is specified with a significant amount
of disk space, but a distributed file system ties the disks from each
compute node into a single mount.
The main advantage of this option is that it scales to external storage
when you require additional storage.
However, this option has several disadvantages:
* Running a distributed file system can make you lose your data
locality compared with nonshared storage.
* Recovery of instances is complicated by depending on multiple hosts.
* The chassis size of the compute node can limit the number of spindles
able to be used in a compute node.
* Use of the network can decrease performance.
* Loss of compute nodes decreases storage availability for all hosts.
On compute node storage—nonshared file system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this option, each compute node is specified with enough disks to store the
instances it hosts.
There are two main advantages:
* Heavy I/O usage on one compute node does not affect instances on other
compute nodes. Direct I/O access can increase performance.
* Each host can have different storage profiles for host aggregates and
availability zones.
There are several disadvantages:
* If a compute node fails, the data associated with the instances running on
that node is lost.
* The chassis size of the compute node can limit the number of spindles
able to be used in a compute node.
* Migrations of instances from one node to another are more complicated
and rely on features that may not continue to be developed.
* If additional storage is required, this option does not scale.
Running a shared file system on a storage system apart from the compute nodes
is ideal for clouds where reliability and scalability are the most important
factors. Running a shared file system on the compute nodes themselves may be
best when you must deploy to pre-existing servers whose specifications you
have little or no control over, or when you have specific storage performance
needs but do not need persistent storage.
Issues with live migration
--------------------------
Live migration is an integral part of the operations of the
cloud. This feature provides the ability to seamlessly move instances
from one physical host to another, a necessity for performing upgrades
that require reboots of the compute hosts, but only works well with
shared storage.
Live migration can also be done with non-shared storage, using a feature
known as *KVM live block migration*. While an earlier implementation of
block-based migration in KVM and QEMU was considered unreliable, there
is a newer, more reliable implementation of block-based live migration
as of the Mitaka release.
Live migration and block migration still have some issues:
* Error reporting received some attention in Mitaka and Newton, but further
improvements are needed.
* Live migration resource tracking issues.
* Live migration of rescued images.
Choice of file system
---------------------
If you want to support shared-storage live migration, you need to
configure a distributed file system.
Possible options include:
* NFS (default for Linux)
* Ceph
* GlusterFS
* MooseFS
* Lustre
We recommend that you choose the option operators are most familiar with.
NFS is the easiest to set up and there is extensive community knowledge
about it.


@ -0,0 +1,413 @@
==========================
Control plane architecture
==========================
.. From Ops Guide chapter: Designing for Cloud Controllers and Cloud
Management
OpenStack is designed to be massively horizontally scalable, which
allows all services to be distributed widely. However, to simplify this
guide, we have decided to discuss services of a more central nature,
using the concept of a *cloud controller*. A cloud controller is a
conceptual simplification. In the real world, you design an architecture
for your cloud controller that enables high availability so that if any
node fails, another can take over the required tasks. In reality, cloud
controller tasks are spread out across more than a single node.
The cloud controller provides the central management system for
OpenStack deployments. Typically, the cloud controller manages
authentication and sends messaging to all the systems through a message
queue.
For many deployments, the cloud controller is a single node. However, to
have high availability, you have to take a few considerations into
account, which we'll cover in this chapter.
The cloud controller manages the following services for the cloud:
Databases
Tracks current information about users and instances, for example,
in a database, typically one database instance managed per service
Message queue services
All :term:`Advanced Message Queuing Protocol (AMQP)` messages for
services are received and sent according to the queue broker
Conductor services
Proxy requests to a database
Authentication and authorization for identity management
Indicates which users can do what actions on certain cloud
resources; quota management, however, is spread out among services
Image-management services
Stores and serves images with metadata on each, for launching in the
cloud
Scheduling services
Indicates which resources to use first; for example, spreading out
where instances are launched based on an algorithm
User dashboard
Provides a web-based front end for users to consume OpenStack cloud
services
API endpoints
Offers each service's REST API access, where the API endpoint
catalog is managed by the Identity service
For our example, the cloud controller has a collection of ``nova-*``
components that represent the global state of the cloud; talks to
services such as authentication; maintains information about the cloud
in a database; communicates to all compute nodes and storage
:term:`workers <worker>` through a queue; and provides API access.
Each service running on a designated cloud controller may be broken out
into separate nodes for scalability or availability.
As another example, you could use pairs of servers for a collective
cloud controller—one active, one standby—for redundant nodes providing a
given set of related services, such as:
- Front end web for API requests, the scheduler for choosing which
compute node to boot an instance on, Identity services, and the
dashboard
- Database and message queue server (such as MySQL, RabbitMQ)
- Image service for the image management
Now that you see the myriad designs for controlling your cloud, read
more about the further considerations to help with your design
decisions.
Hardware Considerations
~~~~~~~~~~~~~~~~~~~~~~~
A cloud controller's hardware can be the same as a compute node, though
you may want to further specify based on the size and type of cloud that
you run.
It's also possible to use virtual machines for all or some of the
services that the cloud controller manages, such as the message queuing.
In this guide, we assume that all services are running directly on the
cloud controller.
:ref:`table_controller_hardware` contains common considerations to
review when sizing hardware for the cloud controller design.
.. _table_controller_hardware:
.. list-table:: Table. Cloud controller hardware sizing considerations
:widths: 25 75
:header-rows: 1
* - Consideration
- Ramification
* - How many instances will run at once?
- Size your database server accordingly, and scale out beyond one cloud
controller if many instances will report status at the same time and
scheduling new instances requires significant computing power.
* - How many compute nodes will run at once?
- Ensure that your messaging queue handles requests successfully and size
accordingly.
* - How many users will access the API?
- If many users will make multiple requests, make sure that the CPU load
for the cloud controller can handle it.
* - How many users will access the dashboard versus the REST API directly?
- The dashboard makes many requests, even more than the API access, so
add even more CPU if your dashboard is the main interface for your users.
* - How many ``nova-api`` services do you run at once for your cloud?
- You need to size the controller with a core per service.
* - How long does a single instance run?
- Starting instances and deleting instances is demanding on the compute
node but also demanding on the controller node because of all the API
queries and scheduling needs.
* - Does your authentication system also verify externally?
- External systems such as :term:`LDAP <Lightweight Directory Access
Protocol (LDAP)>` or :term:`Active Directory` require network
connectivity between the cloud controller and an external authentication
system. Also ensure that the cloud controller has the CPU power to keep
up with requests.
Separation of Services
~~~~~~~~~~~~~~~~~~~~~~
While our example contains all central services in a single location, it
is possible and indeed often a good idea to separate services onto
different physical servers. :ref:`table_deployment_scenarios` is a list
of deployment scenarios we've seen and their justifications.
.. _table_deployment_scenarios:
.. list-table:: Table. Deployment scenarios
:widths: 25 75
:header-rows: 1
* - Scenario
- Justification
* - Run ``glance-*`` servers on the ``swift-proxy`` server.
- This deployment felt that the spare I/O on the Object Storage proxy
server was sufficient and that the Image Delivery portion of glance
benefited from being on physical hardware and having good connectivity
to the Object Storage back end it was using.
* - Run a central dedicated database server.
- This deployment used a central dedicated server to provide the databases
for all services. This approach simplified operations by isolating
database server updates and allowed for the simple creation of slave
database servers for failover.
* - Run one VM per service.
- This deployment ran central services on a set of servers running KVM.
A dedicated VM was created for each service (``nova-scheduler``,
RabbitMQ, database, and so on). This assisted the deployment with scaling
because administrators could tune the resources given to each virtual
machine based on the load it received (something that was not well
understood during installation).
* - Use an external load balancer.
- This deployment had an expensive hardware load balancer in its
organization. It ran multiple ``nova-api`` and ``swift-proxy``
servers on different physical servers and used the load balancer
to switch between them.
One choice that always comes up is whether to virtualize. Some services,
such as ``nova-compute``, ``swift-proxy`` and ``swift-object`` servers,
should not be virtualized. However, control servers can often be happily
virtualized—the performance penalty can usually be offset by simply
running more of the service.
Database
~~~~~~~~
OpenStack Compute uses an SQL database to store and retrieve stateful
information. MySQL is a popular database choice in the OpenStack
community.
Loss of the database leads to errors. As a result, we recommend that you
cluster your database to make it failure tolerant. Configuring and
maintaining a database cluster is done outside OpenStack and is
determined by the database software you choose to use in your cloud
environment. MySQL/Galera is a popular option for MySQL-based databases.
Message Queue
~~~~~~~~~~~~~
Most OpenStack services communicate with each other using the *message
queue*. For example, Compute communicates to block storage services and
networking services through the message queue. Also, you can optionally
enable notifications for any service. RabbitMQ, Qpid, and ZeroMQ are all
popular choices for a message-queue service. In general, if the message
queue fails or becomes inaccessible, the cluster grinds to a halt and
ends up in a read-only state, with information stuck at the point where
the last message was sent. Accordingly, we recommend that you cluster
the message queue. Be aware that clustered message queues can be a pain
point for many OpenStack deployments. While RabbitMQ has native
clustering support, there have been reports of issues when running it at
a large scale. While other queuing solutions are available, such as ZeroMQ
and Qpid, ZeroMQ does not offer stateful queues. Qpid is the messaging
system of choice for Red Hat and its derivatives. Qpid does not have
native clustering capabilities and requires a supplemental service, such
as Pacemaker or Corosync. For your message queue, you need to determine
what level of data loss you are comfortable with and whether to use an
OpenStack project's ability to retry multiple MQ hosts in the event of a
failure, such as using Compute's ability to do so.
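The idea of retrying multiple MQ hosts can be sketched in a few lines of Python. This is only an illustration of the failover pattern, not the way any OpenStack project implements it. ``connect_with_failover`` and ``fake_connect`` are hypothetical names, and ``fake_connect`` stands in for a real AMQP client.

```python
def connect_with_failover(hosts, connect):
    """Try each broker host in order and return the first working connection."""
    failed = []
    for host in hosts:
        try:
            return host, connect(host)
        except ConnectionError:
            # Remember the failure and move on to the next configured host.
            failed.append(host)
    raise ConnectionError(f"no message queue host reachable: {failed}")


def fake_connect(host):
    """Hypothetical stand-in for a real client; only 'mq2' is reachable."""
    if host != "mq2":
        raise ConnectionError(f"{host} is down")
    return f"connection-to-{host}"


host, conn = connect_with_failover(["mq1", "mq2", "mq3"], fake_connect)
print(host, conn)  # prints: mq2 connection-to-mq2
```

Failover of this kind keeps services connected, but messages that were in flight on the failed broker can still be lost, which is why the acceptable level of data loss has to be decided separately.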
Conductor Services
~~~~~~~~~~~~~~~~~~
In earlier versions of OpenStack, all ``nova-compute`` services
required direct access to the database hosted on the cloud controller.
This was problematic for two reasons: security and performance. With
regard to security, if a compute node is compromised, the attacker
inherently has access to the database. With regard to performance,
``nova-compute`` calls to the database are single-threaded and blocking.
This creates a performance bottleneck because database requests are
fulfilled serially rather than in parallel.
The conductor service resolves both of these issues by acting as a proxy
for the ``nova-compute`` service. Now, instead of ``nova-compute``
directly accessing the database, it contacts the ``nova-conductor``
service, and ``nova-conductor`` accesses the database on
``nova-compute``'s behalf. Since ``nova-compute`` no longer has direct
access to the database, the security issue is resolved. Additionally,
``nova-conductor`` is a nonblocking service, so requests from all
compute nodes are fulfilled in parallel.
.. note::
If you are using ``nova-network`` and multi-host networking in your
cloud environment, ``nova-compute`` still requires direct access to
the database.
The ``nova-conductor`` service is horizontally scalable. To make
``nova-conductor`` highly available and fault tolerant, just launch more
instances of the ``nova-conductor`` process, either on the same server
or across multiple servers.
Application Programming Interface (API)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All public access, whether direct, through a command-line client, or
through the web-based dashboard, uses the API service. Find the API
reference at `Development resources for OpenStack clouds
<https://developer.openstack.org/>`_.
You must choose whether you want to support the Amazon EC2 compatibility
APIs, or just the OpenStack APIs. One issue you might encounter when
running both APIs is an inconsistent experience when referring to images
and instances.
For example, the EC2 API refers to instances using IDs that contain
hexadecimal, whereas the OpenStack API uses names and digits. Similarly,
the EC2 API tends to rely on DNS aliases for contacting virtual
machines, as opposed to OpenStack, which typically lists IP
addresses.
If OpenStack is not set up correctly, users can easily end up unable to
contact their instances because they have only an incorrect DNS alias.
Despite this, EC2 compatibility can assist users migrating to your cloud.
As with databases and message queues, having more than one :term:`API server`
is a good thing. Traditional HTTP load-balancing techniques can be used to
achieve a highly available ``nova-api`` service.
Extensions
~~~~~~~~~~
The `API
Specifications <https://developer.openstack.org/api-guide/quick-start/index.html>`_ define
the core actions, capabilities, and media types of the OpenStack API. A
client can always depend on the availability of this core API, and
implementers are always required to support it in its entirety.
Requiring strict adherence to the core API allows clients to rely upon a
minimal level of functionality when interacting with multiple
implementations of the same API.
The OpenStack Compute API is extensible. An extension adds capabilities
to an API beyond those defined in the core. The introduction of new
features, MIME types, actions, states, headers, parameters, and
resources can all be accomplished by means of extensions to the core
API. This allows the introduction of new features in the API without
requiring a version change and allows the introduction of
vendor-specific niche functionality.
Scheduling
~~~~~~~~~~
The scheduling services are responsible for determining the compute or
storage node where a virtual machine or block storage volume should be
created. The scheduling services receive creation requests for these
resources from the message queue and then begin the process of
determining the appropriate node where the resource should reside. This
process is done by applying a series of user-configurable filters
against the available collection of nodes.
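The filtering step can be pictured as a chain of predicates applied to the list of candidate nodes. The Python sketch below is a simplified illustration of that idea. The functions ``ram_filter``, ``core_filter``, and ``filter_hosts``, and the dictionary keys, are hypothetical and are not the scheduler's actual filter classes or data model.

```python
def ram_filter(host, request):
    """Pass hosts with enough free RAM for the requested instance."""
    return host["free_ram_mb"] >= request["ram_mb"]


def core_filter(host, request):
    """Pass hosts with enough free virtual CPUs for the requested instance."""
    return host["free_vcpus"] >= request["vcpus"]


def filter_hosts(hosts, request, filters):
    """Keep only hosts that pass every filter, applied in order."""
    candidates = list(hosts)
    for host_filter in filters:
        candidates = [h for h in candidates if host_filter(h, request)]
    return candidates


# Made-up inventory used only to exercise the filters.
hosts = [
    {"name": "compute1", "free_ram_mb": 2048, "free_vcpus": 8},
    {"name": "compute2", "free_ram_mb": 16384, "free_vcpus": 2},
    {"name": "compute3", "free_ram_mb": 16384, "free_vcpus": 8},
]
request = {"ram_mb": 8192, "vcpus": 4}

selected = filter_hosts(hosts, request, [ram_filter, core_filter])
print([h["name"] for h in selected])  # prints ['compute3']
```

Because each filter only narrows the candidate list, operators can add or remove filters without changing the others.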
There are currently two schedulers: ``nova-scheduler`` for virtual
machines and ``cinder-scheduler`` for block storage volumes. Both
schedulers are able to scale horizontally, so for high-availability
purposes, or for very large or high-schedule-frequency installations,
you should consider running multiple instances of each scheduler. The
schedulers all listen to the shared message queue, so no special load
balancing is required.
Images
~~~~~~
The OpenStack Image service consists of two parts: ``glance-api`` and
``glance-registry``. The former is responsible for the delivery of
images; the compute node uses it to download images from the back end.
The latter maintains the metadata information associated with virtual
machine images and requires a database.
The ``glance-api`` part is an abstraction layer that allows a choice of
back end. Currently, it supports:
OpenStack Object Storage
Allows you to store images as objects.
File system
Uses any traditional file system to store the images as files.
S3
Allows you to fetch images from Amazon S3.
HTTP
Allows you to fetch images from a web server. You cannot write
images by using this mode.
If you have an OpenStack Object Storage service, we recommend using this
as a scalable place to store your images. You can also use a file system
with sufficient performance or Amazon S3—unless you do not need the
ability to upload new images through OpenStack.
Dashboard
~~~~~~~~~
The OpenStack dashboard (horizon) provides a web-based user interface to
the various OpenStack components. The dashboard includes an end-user
area for users to manage their virtual infrastructure and an admin area
for cloud operators to manage the OpenStack environment as a
whole.
The dashboard is implemented as a Python web application that normally
runs in :term:`Apache` ``httpd``. Therefore, you may treat it the same as any
other web application, provided it can reach the API servers (including
their admin endpoints) over the network.
Authentication and Authorization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The concepts supporting OpenStack's authentication and authorization are
derived from well-understood and widely used systems of a similar
nature. Users have credentials they can use to authenticate, and they
can be a member of one or more groups (known as projects or tenants,
interchangeably).
For example, a cloud administrator might be able to list all instances
in the cloud, whereas a user can see only those in their current project.
Resource quotas, such as the number of cores that can be used, disk
space, and so on, are associated with a project.
OpenStack Identity provides authentication decisions and user attribute
information, which is then used by the other OpenStack services to
perform authorization. The policy is set in the ``policy.json`` file.
For information on how to configure these, see `Managing Projects and Users
<https://docs.openstack.org/operations-guide/ops-projects-users.html>`_ in the
OpenStack Operations Guide.
OpenStack Identity supports different plug-ins for authentication
decisions and identity storage. Examples of these plug-ins include:
- In-memory key-value Store (a simplified internal storage structure)
- SQL database (such as MySQL or PostgreSQL)
- Memcached (a distributed memory object caching system)
- LDAP (such as OpenLDAP or Microsoft's Active Directory)
Many deployments use the SQL database; however, LDAP is also a popular
choice for those with existing authentication infrastructure that needs
to be integrated.
Network Considerations
~~~~~~~~~~~~~~~~~~~~~~
Because the cloud controller handles so many different services, it must
be able to handle the amount of traffic that hits it. For example, if
you choose to host the OpenStack Image service on the cloud controller,
the cloud controller should be able to support the transferring of the
images at an acceptable speed.
As another example, if you choose to use single-host networking where
the cloud controller is the network gateway for all instances, then the
cloud controller must support the total amount of traffic that travels
between your cloud and the public Internet.
We recommend that you use a fast NIC, such as 10 Gb. You can also choose
to use two 10 Gb NICs and bond them together. While you might not be
able to get a full bonded 20 Gb speed, different transmission streams
use different NICs. For example, if the cloud controller transfers two
images, each image uses a different NIC and gets a full 10 Gb of
bandwidth.
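
As a rough sizing aid, the following sketch estimates the ideal time to
transfer an image at a given line rate. The image size is a hypothetical
value, the rate assumes that the NIC figure refers to gigabits per second,
and the calculation ignores protocol overhead and disk throughput, so treat
the result as a lower bound rather than a prediction.

```python
def transfer_seconds(image_bytes: int, link_bits_per_second: float) -> float:
    """Return the ideal time to move image_bytes over a link (no overhead)."""
    return image_bytes * 8 / link_bits_per_second


# Hypothetical example: an 8 GB image over a single 10 Gb/s NIC.
image_size = 8 * 10**9
ten_gbps = 10 * 10**9
print(f"{transfer_seconds(image_size, ten_gbps):.1f} seconds")
```

With bonded NICs, two such transfers that land on different links would each
take roughly this long, rather than sharing one link.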

View File

@ -0,0 +1,3 @@
=====================
Identity architecture
=====================

View File

@ -0,0 +1,3 @@
==========================
Image Service architecture
==========================

View File

@ -0,0 +1,31 @@
.. _network-design:
====================
Network architecture
====================
.. toctree::
:maxdepth: 2
design-networking/design-networking-concepts
design-networking/design-networking-design
design-networking/design-networking-services
OpenStack provides a rich networking environment. This chapter
details the requirements and options to consider when designing your
cloud. This includes examples of network implementations to
consider, information about some OpenStack network layouts and networking
services that are essential for stable operation.
.. warning::
If this is the first time you are deploying a cloud infrastructure
in your organization, your first conversations should be with your
networking team. Network usage in a running cloud is vastly different
from traditional network deployments and has the potential to be
disruptive at both a connectivity and a policy level.
For example, you must plan the number of IP addresses that you need for
both your guest instances and your management infrastructure.
Additionally, you must research and discuss cloud network connectivity
through proxy servers and firewalls.
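
When planning address counts, it can help to check how many hosts a candidate
range actually provides. The following is a minimal sketch that uses the
Python standard library ``ipaddress`` module; the network names and CIDR
ranges are hypothetical placeholders, not recommendations.

```python
import ipaddress


def usable_hosts(cidr: str) -> int:
    """Count IPv4 host addresses, excluding the network and broadcast addresses."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - 2, 0)


# Hypothetical ranges; substitute the ranges your networking team allocates.
plan = {"management": "10.0.0.0/24", "guest instances": "10.1.0.0/20"}
for name, cidr in plan.items():
    print(f"{name}: {cidr} provides {usable_hosts(cidr)} usable addresses")
```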

View File

@ -0,0 +1,218 @@
===================
Networking concepts
===================
A cloud environment fundamentally changes the ways that networking is provided
and consumed. Understanding the following concepts and decisions is imperative
when making architectural decisions. For detailed information on networking
concepts, see the `OpenStack Networking Guide
<https://docs.openstack.org/ocata/networking-guide/>`_.
Network zones
~~~~~~~~~~~~~
The cloud networks are divided into a number of logical zones that support the
network traffic flow requirements. We recommend defining at least four
distinct network zones.
Underlay
--------
The underlay zone is defined as the physical network switching infrastructure
that connects the storage, compute and control platforms. There are a large
number of potential underlay options available.
Overlay
-------
The overlay zone is defined as any L3 connectivity between the cloud components
and could take the form of SDN solutions such as the neutron overlay solution
or third-party SDN solutions.
Edge
----
The edge zone is where network traffic transitions from the cloud overlay or
SDN networks into the traditional network environments.
External
--------
The external network is defined as all the components outside of the cloud
edge gateways, together with the configuration that is required to provide
access to cloud resources and workloads.
Traffic flow
~~~~~~~~~~~~
There are two primary types of traffic flow within a cloud infrastructure. The
choice of networking technologies is influenced by the expected loads.
East/West - The internal traffic flow between workloads within the cloud, as
well as the traffic flow between the compute nodes and storage nodes, falls
into the East/West category. Generally this is the heaviest traffic flow and,
because it carries storage access, it needs a minimum of hops and low
latency.
North/South - The flow of traffic between the workload and all external
networks, including clients and remote services. This traffic flow is highly
dependent on the workload within the cloud and the type of network services
being offered.
Layer networking choices
~~~~~~~~~~~~~~~~~~~~~~~~
There are several factors to take into consideration when deciding whether
to use a layer-2 networking architecture or a layer-3 networking architecture.
For more information about OpenStack networking concepts, see the
`OpenStack Networking <https://docs.openstack.org/ocata/networking-guide/intro-os-networking.html#>`_
section in the OpenStack Networking Guide.
Benefits using a Layer-2 network
--------------------------------
There are several reasons a network designed on layer-2 protocols is selected
over a network designed on layer-3 protocols. In spite of the difficulties of
using a bridge to perform the network role of a router, many vendors,
customers, and service providers choose to use Ethernet in as many parts of
their networks as possible. The benefits of selecting a layer-2 design are:
* Ethernet frames contain all the essentials for networking. These include, but
are not limited to, globally unique source addresses, globally unique
destination addresses, and error control.
* Ethernet frames can carry any kind of packet. Networking at layer-2 is
independent of the layer-3 protocol.
* Adding more layers to the Ethernet frame only slows the networking process
down. This is known as nodal processing delay.
* You can add adjunct networking features, for example class of service (CoS)
or multicasting, to Ethernet as readily as IP networks.
* VLANs are an easy mechanism for isolating networks.
Most information starts and ends inside Ethernet frames. Today this applies
to data, voice, and video. The concept is that the network will benefit more
from the advantages of Ethernet if the transfer of information from a source
to a destination is in the form of Ethernet frames.
Although it is not a substitute for IP networking, networking at layer-2 can
be a powerful adjunct to IP networking.
Layer-2 Ethernet usage has additional benefits over layer-3 IP network usage:
* Speed
* Reduced overhead of the IP hierarchy.
* No need to keep track of address configuration as systems move around.
Whereas the simplicity of layer-2 protocols might work well in a data center
with hundreds of physical machines, cloud data centers have the additional
burden of needing to keep track of all virtual machine addresses and
networks. In these data centers, it is not uncommon for one physical node
to support 30-40 instances.
.. Important::
Networking at the frame level says nothing about the presence or
absence of IP addresses at the packet level. Almost all ports, links, and
devices on a network of LAN switches still have IP addresses, as do all the
source and destination hosts. There are many reasons for the continued need
for IP addressing. The largest one is the need to manage the network. A
device or link without an IP address is usually invisible to most
management applications. Utilities including remote access for diagnostics,
file transfer of configurations and software, and similar applications
cannot run without IP addresses as well as MAC addresses.
Layer-2 architecture limitations
--------------------------------
Layer-2 network architectures have some limitations that become noticeable when
used outside of traditional data centers.
* Number of VLANs is limited to 4096.
* The number of MACs stored in switch tables is limited.
* You must accommodate the need to maintain a set of layer-4 devices to handle
traffic control.
* MLAG, often used for switch redundancy, is a proprietary solution that does
not scale beyond two devices and forces vendor lock-in.
* It can be difficult to troubleshoot a network without IP addresses and ICMP.
* Configuring ARP can be complicated on large layer-2 networks.
* All network devices need to be aware of all MACs, even instance MACs, so
there is constant churn in MAC tables and network state changes as instances
start and stop.
* Migrating MACs (instance migration) to different physical locations is a
potential problem if you do not set ARP table timeouts properly.
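
The VLAN limit in the list above follows from the width of the identifier
field. The following sketch shows the arithmetic, and compares it with the
24-bit network identifier used by VXLAN encapsulation. The field widths come
from IEEE 802.1Q and RFC 7348; the constant and function names are
illustrative only.

```python
VLAN_ID_BITS = 12    # width of the VLAN ID field in an IEEE 802.1Q tag
VXLAN_VNI_BITS = 24  # width of the VXLAN network identifier (RFC 7348)


def id_space(bits: int) -> int:
    """Return how many distinct identifiers fit in a field of the given width."""
    return 2 ** bits


print(id_space(VLAN_ID_BITS))    # 4096
print(id_space(VXLAN_VNI_BITS))  # 16777216
```

In practice, 802.1Q reserves VLAN IDs 0 and 4095, so 4094 VLANs are usable.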
It is important to know that layer-2 has a very limited set of network
management tools. It is difficult to control traffic as it does not have
mechanisms to manage the network or shape the traffic. Network
troubleshooting is also troublesome, in part because network devices have
no IP addresses. As a result, there is no reasonable way to check network
delay.
In a layer-2 network all devices are aware of all MACs, even those that belong
to instances. The network state information in the backbone changes whenever an
instance starts or stops. Because of this, there is far too much churn in the
MAC tables on the backbone switches.
Furthermore, on large layer-2 networks, configuring ARP learning can be
complicated. The setting for the MAC address timer on switches is critical
and, if set incorrectly, can cause significant performance problems. So when
migrating MACs to different physical locations to support instance migration,
problems may arise. As an example, the Cisco default MAC address timer is
extremely long. As such, the network information maintained in the switches
could be out of sync with the new location of the instance.
Benefits using a Layer-3 network
--------------------------------
In layer-3 networking, routing takes instance MAC and IP addresses out of the
network core, reducing state churn. The only time there would be a routing
state change is in the case of a Top of Rack (ToR) switch failure or a link
failure in the backbone itself. Other advantages of using a layer-3
architecture include:
* Layer-3 networks provide the same level of resiliency and scalability
as the Internet.
* Controlling traffic with routing metrics is straightforward.
* You can configure layer-3 to use Border Gateway Protocol (BGP) confederation
for scalability. This way core routers have state proportional to the number
of racks, not to the number of servers or instances.
* There are a variety of well-tested tools, such as Internet Control Message
Protocol (ICMP), to monitor and manage traffic.
* Layer-3 architectures enable the use of :term:`quality of service (QoS)` to
manage network performance.
Layer-3 architecture limitations
--------------------------------
The main limitation of layer-3 networking is that there is no built-in
isolation mechanism comparable to the VLANs in layer-2 networks. Furthermore,
the hierarchical nature of IP addresses means that an instance is on the same
subnet as its physical host, making migration out of the subnet difficult. For
these reasons, network virtualization needs to use IP encapsulation and
software at the end hosts. This is for isolation and the separation of the
addressing in the virtual layer from the addressing in the physical layer.
Other potential disadvantages of layer-3 networking include the need to design
an IP addressing scheme rather than relying on the switches to keep track of
the MAC addresses automatically, and to configure the interior gateway routing
protocol in the switches.
Networking service (neutron)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OpenStack Networking (neutron) is the component of OpenStack that provides
the Networking service API and a reference architecture that implements a
Software Defined Network (SDN) solution.
The Networking service provides full control over creation of virtual network
resources to tenants. This is often accomplished in the form of tunneling
protocols that establish encapsulated communication paths over existing
network infrastructure in order to segment tenant traffic. This method varies
depending on the specific implementation, but some of the more common methods
include tunneling over GRE, encapsulating with VXLAN, and VLAN tags.

View File

@ -0,0 +1,281 @@
==============================
Designing an OpenStack network
==============================
There are many reasons an OpenStack network has complex requirements. One main
factor is that many components interact at different levels of the system
stack. Data flows are also complex.
Data in an OpenStack cloud moves between instances across the network
(known as east-west traffic), as well as in and out of the system (known
as north-south traffic). Physical server nodes have network requirements that
are independent of instance network requirements and must be isolated to
account for scalability. We recommend separating the networks for security
purposes and tuning performance through traffic shaping.
You must consider a number of important technical and business requirements
when planning and designing an OpenStack network:
* Avoid hardware or software vendor lock-in. The design should not rely on
specific features of a vendor's network router or switch.
* Massively scale the ecosystem to support millions of end users.
* Support an indeterminate variety of platforms and applications.
* Design for cost efficient operations to take advantage of massive scale.
* Ensure that there is no single point of failure in the cloud ecosystem.
* Provide a high availability architecture to meet customer SLA requirements.
* Tolerate rack-level failure.
* Maximize flexibility to architect future production environments.
Considering these requirements, we recommend the following:
* Design a layer-3 network architecture rather than a layer-2 network
architecture.
* Design a dense multi-path network core to support multi-directional
scaling and flexibility.
* Use hierarchical addressing because it is the only viable option to scale
a network ecosystem.
* Use virtual networking to isolate instance service network traffic from the
management and internal network traffic.
* Isolate virtual networks using encapsulation technologies.
* Use traffic shaping for performance tuning.
* Use External Border Gateway Protocol (eBGP) to connect to the Internet
up-link.
* Use Internal Border Gateway Protocol (iBGP) to flatten the internal traffic
on the layer-3 mesh.
* Determine the most effective configuration for the block storage network.
Additional network design considerations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are several other considerations when designing a network-focused
OpenStack cloud.
Redundant networking
--------------------
You should conduct a high availability risk analysis to determine whether to
use redundant switches such as Top of Rack (ToR) switches. In most cases, it
is much more economical to use single switches with a small pool of spare
switches to replace failed units than it is to outfit an entire data center
with redundant switches. Applications should tolerate rack level outages
without affecting normal operations since network and compute resources are
easily provisioned and plentiful.
Research indicates the mean time between failures (MTBF) on switches is
between 100,000 and 200,000 hours. This number is dependent on the ambient
temperature of the switch in the data center. When properly cooled and
maintained, this translates to between 11 and 22 years before failure. Even
in the worst case of poor ventilation and high ambient temperatures in the data
center, the MTBF is still 2-3 years.
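
The conversion from MTBF hours to years, and a rough estimate of how many
spares a fleet might need, can be sketched as follows. The fleet size is a
hypothetical value, and the failure estimate assumes a constant failure rate
across identical switches, which is a simplification.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap days are ignored


def mtbf_years(mtbf_hours: float) -> float:
    """Convert a mean time between failures from hours to years."""
    return mtbf_hours / HOURS_PER_YEAR


def expected_failures_per_year(switch_count: int, mtbf_hours: float) -> float:
    """Estimate yearly failures in a fleet, assuming a constant failure rate."""
    return switch_count * HOURS_PER_YEAR / mtbf_hours


for hours in (100_000, 200_000):
    print(f"{hours} hours is about {mtbf_years(hours):.1f} years")

# Hypothetical fleet of 100 switches at the lower MTBF figure.
print(f"{expected_failures_per_year(100, 100_000):.2f} failures per year")
```

An estimate like this is one input for sizing the pool of spare switches.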
.. Link to research findings?
.. TODO Legacy networking (nova-network)
.. TODO OpenStack Networking
.. TODO Simple, single agent
.. TODO Complex, multiple agents
.. TODO Flat or VLAN
.. TODO Flat, VLAN, Overlays, L2-L3, SDN
.. TODO No plug-in support
.. TODO Plug-in support for 3rd parties
.. TODO No multi-tier topologies
.. TODO Multi-tier topologies
.. What about network security? (DC)
Providing IPv6 support
----------------------
One of the most important networking topics today is the exhaustion of
IPv4 addresses. In late 2015, ICANN announced that the final
IPv4 address blocks had been fully assigned. Because of this, the IPv6
protocol is becoming essential for network-focused applications. IPv6
increases the address space significantly and fixes long-standing issues
in the IPv4 protocol.
OpenStack Networking, when configured for it, supports IPv6. To enable
IPv6, create an IPv6 subnet in Networking and use IPv6 prefixes when
creating security groups.
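
Before creating IPv6 subnets or security group rules, you may want to
validate the prefixes you intend to use. The following sketch uses the Python
standard library ``ipaddress`` module. The helper name is hypothetical, and
the prefix is taken from the 2001:db8::/32 documentation range, so replace it
with a prefix assigned to your organization.

```python
import ipaddress


def is_ipv6_prefix(prefix: str) -> bool:
    """Return True if prefix parses as a valid IPv6 network."""
    try:
        return ipaddress.ip_network(prefix, strict=True).version == 6
    except ValueError:
        return False


# 2001:db8::/32 is reserved for documentation examples.
subnet = ipaddress.ip_network("2001:db8:1234::/64")
print(subnet.num_addresses == 2 ** 64)
print(is_ipv6_prefix("2001:db8:1234::/64"), is_ipv6_prefix("192.0.2.0/24"))
```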
Supporting asymmetric links
---------------------------
When designing a network architecture, the traffic patterns of an
application heavily influence the allocation of total bandwidth and
the number of links that you use to send and receive traffic. Applications
that provide file storage for customers allocate bandwidth and links to
favor incoming traffic; whereas video streaming applications allocate
bandwidth and links to favor outgoing traffic.
Optimizing network performance
------------------------------
It is important to analyze the application's tolerance for latency and
jitter when designing an environment to support network-focused
applications. Certain applications, for example VoIP, are less tolerant
of latency and jitter. When latency and jitter are issues, certain
applications may require tuning of QoS parameters and network device
queues to ensure that their packets are queued for transmission immediately
or are guaranteed minimum bandwidth. Since OpenStack currently does not
support these functions, choose your network plug-in carefully.
The location of a service may also impact the application or consumer
experience. If an application serves differing content to different users,
it must properly direct connections to those specific locations. Where
appropriate, use a multi-site installation for these situations.
You can implement networking in two separate ways. Legacy networking
(nova-network) provides a flat DHCP network with a single broadcast domain.
This implementation does not support tenant isolation networks or advanced
plug-ins, but it is currently the only way to implement a distributed
layer-3 (L3) agent using the multi-host configuration. The Networking service
(neutron) is the official networking implementation and provides a pluggable
architecture that supports a large variety of network methods. Some of these
include a layer-2 only provider network model, external device plug-ins, or
even OpenFlow controllers.
Networking at large scales becomes a set of boundary questions. The
determination of how large a layer-2 domain must be is based on the
number of nodes within the domain and the amount of broadcast traffic
that passes between instances. Breaking layer-2 boundaries may require
the implementation of overlay networks and tunnels. This decision is a
balancing act between the need for a smaller overhead and the need for a
smaller domain.
When selecting network devices, be aware that making a decision based on the
greatest port density often comes with a drawback. Aggregation switches and
routers have not all kept pace with ToR switches and may induce
bottlenecks on north-south traffic. As a result, it may be possible for
massive amounts of downstream network utilization to impact upstream network
devices, impacting service to the cloud. Since OpenStack does not currently
provide a mechanism for traffic shaping or rate limiting, it is necessary to
implement these features at the network hardware level.
Using tunable networking components
-----------------------------------
When designing for network-intensive workloads, consider the configurable
networking components of an OpenStack architecture design, which include
MTU and QoS. Some workloads require a larger MTU than normal
due to the transfer of large blocks of data. When providing network
service for applications such as video streaming or storage replication,
we recommend that you configure both OpenStack hardware nodes and the
supporting network equipment for jumbo frames where possible. This
allows for better use of available bandwidth. Configure jumbo frames across the
complete path the packets traverse. If one network component is not capable of
handling jumbo frames then the entire path reverts to the default MTU.
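
The reason every component matters is that the largest packet that can cross
a path without fragmentation is bounded by the smallest MTU along it. The
following is a minimal sketch of that check; the hop names and MTU values are
hypothetical.

```python
def path_mtu(hop_mtus):
    """Return the largest packet size that every hop on the path can carry."""
    return min(hop_mtus)


# Hypothetical path with one device that is not configured for jumbo frames.
hops = {"compute": 9000, "tor": 9000, "aggregation": 1500, "storage": 9000}
print(path_mtu(hops.values()))  # 1500
```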
:term:`Quality of Service (QoS)` also has a great impact on network-intensive
workloads, as it gives preferential service to packets that have a higher
priority because they are more sensitive to poor network performance. In applications such as
Voice over IP (VoIP), differentiated services code points are a near
requirement for proper operation. You can also use QoS in the opposite
direction for mixed workloads to prevent low priority but high bandwidth
applications, for example backup services, video conferencing, or file sharing,
from blocking bandwidth that is needed for the proper operation of other
workloads. It is possible to tag file storage traffic as a lower class, such as
best effort or scavenger, to allow the higher priority traffic through. In
cases where regions within a cloud might be geographically distributed it may
also be necessary to plan accordingly to implement WAN optimization to combat
latency or packet loss.
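
As an illustration of differentiated services code points, the following
sketch shows how an application could mark its own outgoing packets. It
assumes a platform where the Python ``socket`` module exposes ``IP_TOS`` (for
example, Linux), and it assumes that the network devices are configured to
trust and act on the markings. The function names are hypothetical.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice traffic
DSCP_CS1 = 8  # Class Selector 1, often used as a lower-priority class


def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP value into the upper six bits of the IP TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Create a UDP socket whose outgoing IPv4 packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(dscp_to_tos(DSCP_EF))   # 184
print(dscp_to_tos(DSCP_CS1))  # 32
```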
Choosing network hardware
~~~~~~~~~~~~~~~~~~~~~~~~~
The network architecture determines which network hardware will be
used. Networking software is determined by the selected networking
hardware.
There are more subtle design impacts that need to be considered. The
selection of certain networking hardware (and the networking software)
affects the management tools that can be used. There are exceptions to
this; the rise of *open* networking software that supports a range of
networking hardware means there are instances where the relationship
between networking hardware and networking software is not as tightly
defined.
Some of the key considerations in the selection of networking hardware
include:
Port count
The design will require networking hardware that has the requisite
port count.
Port density
The network design will be affected by the physical space that is
required to provide the requisite port count. A higher port density
is preferred, as it leaves more rack space for compute or storage
components. This can also lead into considerations about fault domains
and power density. Higher density switches are more expensive, therefore
it is important not to over design the network.
Port speed
The networking hardware must support the proposed network speed, for
example: 1 GbE, 10 GbE, or 40 GbE (or even 100 GbE).
Redundancy
User requirements for high availability and cost considerations
influence the level of network hardware redundancy. Network redundancy
can be achieved by adding redundant power supplies or paired switches.
.. note::
Hardware must support network redundancy.
Power requirements
Ensure that the physical data center provides the necessary power
for the selected network hardware.
.. note::
This is not an issue for top of rack (ToR) switches. This may be an issue
for spine switches in a leaf and spine fabric, or end of row (EoR)
switches.
Protocol support
It is possible to gain more performance out of a single storage
system by using specialized network technologies such as RDMA, SRP,
iSER and SCST. The specifics of using these technologies are beyond
the scope of this book.
There is no single best practice architecture for the networking
hardware supporting an OpenStack cloud. Some of the key factors that will
have a major influence on selection of networking hardware include:
Connectivity
All nodes within an OpenStack cloud require network connectivity. In
some cases, nodes require access to more than one network segment.
The design must encompass sufficient network capacity and bandwidth
to ensure that all communications within the cloud, both north-south
and east-west traffic, have sufficient resources available.
Scalability
The network design should encompass a physical and logical network
design that can be easily expanded upon. Network hardware should
offer the appropriate types of interfaces and speeds that are
required by the hardware nodes.
Availability
To ensure access to nodes within the cloud is not interrupted,
we recommend that the network architecture identifies any single
points of failure and provides some level of redundancy or fault
tolerance. The network infrastructure often involves use of
networking protocols such as LACP, VRRP or others to achieve a highly
available network connection. It is also important to consider the
networking implications on API availability. We recommend a load balancing
solution is designed within the network architecture to ensure that the APIs
and potentially other services in the cloud are highly available.
Choosing networking software
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OpenStack Networking (neutron) provides a wide variety of networking
services for instances. There are many additional networking software
packages that can be useful when managing OpenStack components. Some
examples include:
- Software to provide load balancing
- Network redundancy protocols
- Routing daemons
.. TODO Provide software examples

View File

@ -0,0 +1,70 @@
==============================
Additional networking services
==============================
OpenStack, like any network application, has a number of standard
services to consider, such as NTP and DNS.
NTP
~~~
Time synchronization is a critical element to ensure continued operation
of OpenStack components. Ensuring that all components have the correct
time is necessary to avoid errors in instance scheduling, replication of
objects in the object store, and matching log timestamps for debugging.
All servers running OpenStack components should be able to access an
appropriate NTP server. You may decide to set up one locally or use the
public pools available from the `Network Time Protocol
project <http://www.pool.ntp.org/>`_.
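
To illustrate what time synchronization measures, the following sketch
applies the standard NTP clock offset and round-trip delay formulas to the
four timestamps of one request and response exchange. The timestamps are
hypothetical values, and a real deployment would rely on an NTP daemon rather
than on code like this.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Compute clock offset and round-trip delay from one NTP exchange.

    t0: client send time      t1: server receive time
    t2: server send time      t3: client receive time
    All values are in seconds; t0 and t3 use the client clock,
    t1 and t2 use the server clock.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Hypothetical exchange: the client clock is 0.5 s behind the server,
# and each direction takes 0.01 s on the network.
offset, delay = ntp_offset_and_delay(100.0, 100.51, 100.52, 100.03)
print(f"offset={offset:.2f}s delay={delay:.2f}s")
```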
DNS
~~~
Designate is a multi-tenant DNSaaS service for OpenStack. It provides a REST
API with integrated keystone authentication. It can be configured to
auto-generate records based on nova and neutron actions. Designate supports a
variety of DNS servers including Bind9 and PowerDNS.
The DNS service provides DNS Zone and RecordSet management for OpenStack
clouds. The DNS Service includes a REST API, a command-line client, and a
horizon Dashboard plugin.
For more information, see the `Designate project <https://www.openstack.org/software/releases/ocata/components/designate>`_
web page.
.. note::
The Designate service does not provide DNS service for the OpenStack
infrastructure upon install. We recommend working with your service
provider when installing OpenStack in order to properly name your
servers and other infrastructure hardware.
DHCP
~~~~
OpenStack neutron deploys various agents when a network is created within
OpenStack. One of these agents is a DHCP agent. This DHCP agent uses the Linux
binary dnsmasq as the delivery agent for DHCP. This agent manages the network
namespaces that are spawned for each project subnet to act as a DHCP server.
The dnsmasq process is capable of allocating IP addresses to all virtual
machines running on a network. When a network is created through OpenStack and
the DHCP agent is enabled for that network, DHCP services are enabled by
default.
LBaaS
~~~~~
OpenStack neutron has the ability to distribute incoming requests between
designated instances. Using neutron networking and OVS, Load
Balancing-as-a-Service (LBaaS) can be created. The load balancing of workloads
is used to distribute incoming application requests evenly between designated
instances. This operation ensures that a workload is shared predictably among
defined instances and allows a more effective use of underlying resources.
OpenStack LBaaS can distribute load using the following methods:
* Round robin - Even rotation between multiple defined instances.
* Source IP - Requests from specific IPs are consistently directed to the same
instance.
* Least connections - Sends requests to the instance with the fewest
active connections.
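
The three methods above can be sketched in a few lines. This is a conceptual
illustration only and is not how the LBaaS drivers are implemented; the class
name, member names, and the hash used for source IP selection are all
assumptions made for the example.

```python
import hashlib
import itertools


class Balancer:
    """Toy load balancer that illustrates three member selection methods."""

    def __init__(self, members):
        self.members = list(members)
        self._cycle = itertools.cycle(self.members)
        # Active connection count per member, updated by the caller.
        self.active = {member: 0 for member in self.members}

    def round_robin(self):
        """Rotate evenly through the members."""
        return next(self._cycle)

    def source_ip(self, client_ip: str):
        """Map a client address to the same member on every call."""
        digest = hashlib.sha256(client_ip.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.members)
        return self.members[index]

    def least_connections(self):
        """Pick the member with the fewest active connections."""
        return min(self.members, key=lambda member: self.active[member])


lb = Balancer(["web-1", "web-2", "web-3"])
print([lb.round_robin() for _ in range(4)])
print(lb.source_ip("203.0.113.7") == lb.source_ip("203.0.113.7"))
```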

View File

@ -0,0 +1,13 @@
====================
Storage architecture
====================
Storage is found in many parts of the OpenStack cloud environment. This
chapter describes storage types, design considerations, and options when
selecting persistent storage for your cloud environment.
.. toctree::
:maxdepth: 2
design-storage/design-storage-concepts
design-storage/design-storage-arch

View File

@ -0,0 +1,546 @@
====================
Storage architecture
====================
There are many different storage architectures available when designing an
OpenStack cloud. The convergence of orchestration and automation within the
OpenStack platform enables rapid storage provisioning without the hassle of
the traditional manual processes like volume creation and
attachment.
However, before choosing a storage architecture, a few generic questions should
be answered:
* Will the storage architecture scale linearly as the cloud grows and what are
its limits?
* What is the desired attachment method: NFS, iSCSI, FC, or other?
* Is the storage proven with the OpenStack platform?
* What is the level of support provided by the vendor within the community?
* What OpenStack features and enhancements does the cinder driver enable?
* Does it include tools to help troubleshoot and resolve performance issues?
* Is it interoperable with all of the projects you are planning on using
in your cloud?
Choosing storage back ends
~~~~~~~~~~~~~~~~~~~~~~~~~~
Users will indicate different needs for their cloud architecture. Some may
need fast access to many objects that do not change often, or want to
set a time-to-live (TTL) value on a file. Others may access only storage
that is mounted with the file system itself, but want it to be
replicated instantly when starting a new instance. For other systems,
ephemeral storage is the preferred choice. When you select
:term:`storage back ends <storage back end>`,
consider the following questions from the user's perspective:
First and foremost:
* Do I need block storage?
* Do I need object storage?
* Do I need file-based storage?
Next answer the following:
* Do I need to support live migration?
* Should my persistent storage drives be contained in my compute nodes,
or should I use external storage?
* What type of performance do I need with regard to IOPS? Total IOPS and IOPS
per instance? Do I have applications with IOPS SLAs?
* Are my storage needs mostly read, or write, or mixed?
* Which storage choices result in the best cost-performance scenario I am
aiming for?
* How do I manage the storage operationally?
* How redundant and distributed is the storage? What happens if a
storage node fails? To what extent can it mitigate my data-loss disaster
scenarios?
* What is my company currently using and can I use it with OpenStack?
* Do I need more than one storage choice? Do I need tiered performance storage?
While this is not a definitive list of all the questions possible, the list
above should help narrow down the possible storage choices.
A wide variety of use case requirements dictate the nature of the storage
back end. Examples of such requirements are as follows:
* Public, private, or a hybrid cloud (performance profiles, shared storage,
replication options)
* Storage-intensive use cases like HPC and Big Data clouds
* Web-scale or development clouds where storage is typically ephemeral in
nature
Data security recommendations:
* We recommend that data be encrypted both in transit and at-rest.
To this end, carefully select disks, appliances, and software.
Do not assume these features are included with all storage solutions.
* Determine the security policy of your organization and understand
the data sovereignty of your cloud geography and plan accordingly.
If you plan to use live migration, we highly recommend a shared storage
configuration. This allows the operating system and application volumes
for instances to reside outside of the compute nodes and significantly
improves performance when live migrating.
To deploy your storage by using only commodity hardware, you can use a number
of open-source packages, as described in :ref:`table_persistent_file_storage`.
.. _table_persistent_file_storage:
.. list-table:: Persistent file-based storage support
:widths: 25 25 25 25
:header-rows: 1
* -
- Object
- Block
- File-level
* - Swift
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
-
-
* - LVM
-
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
-
* - Ceph
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
- Experimental
* - Gluster
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
* - NFS
-
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
* - ZFS
-
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
-
* - Sheepdog
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
- .. image:: /figures/Check_mark_23x20_02.png
:width: 30%
-
This list of open source file-level shared storage solutions is not
exhaustive. Your organization may already have deployed a file-level shared
storage solution that you can use.
.. note::
**Storage driver support**
In addition to the open source technologies, there are a number of
proprietary solutions that are officially supported by OpenStack Block
Storage. You can find a matrix of the functionality provided by all of the
supported Block Storage drivers on the `CinderSupportMatrix
wiki <https://wiki.openstack.org/wiki/CinderSupportMatrix>`_.
Also, you need to decide whether you want to support object storage in
your cloud. Common use cases for providing object storage in a
compute cloud are to provide:
* Users with a persistent storage mechanism for objects like images and video.
* A scalable, reliable data store for OpenStack virtual machine images.
* An API driven S3 compatible object store for application use.
Selecting storage hardware
~~~~~~~~~~~~~~~~~~~~~~~~~~
The storage architecture you select determines the storage hardware
architecture. Select a storage architecture by
evaluating possible solutions against the critical factors, the user
requirements, technical considerations, and operational considerations.
Consider the following factors when selecting storage hardware:
Cost
Storage can be a significant portion of the overall system cost. For
an organization that is concerned with vendor support, a commercial
storage solution is advisable, although it comes with a higher price
tag. If initial capital expenditure requires minimization, designing
a system based on commodity hardware would apply. The trade-off is
potentially higher support costs and a greater risk of
incompatibility and interoperability issues.
Performance
Performance of block based storage is typically measured in the maximum read
and write operations to non-contiguous storage locations per second. This
measurement typically applies to SAN, hard drives, and solid state drives.
While IOPS can be broadly measured, it is not an official benchmark, although
many vendors use it to communicate performance levels. Since
there are no real standards for measuring IOPS, vendor test results may vary,
sometimes wildly. However, along with transfer rate, which measures the speed
at which data can be transferred to contiguous storage locations, IOPS can be
used in a performance evaluation. Typically, transfer rate is expressed in
bytes per second, while IOPS is expressed as an integer.
To calculate IOPS for a single drive you could use:
IOPS = 1 / (AverageLatency + AverageSeekTime)
For example:
Average Latency for Single Disk = 2.99ms or .00299 seconds
Average Seek Time for Single Disk = 4.7ms or .0047 seconds
IOPS = 1/(.00299 + .0047)
IOPS = 130
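The single-drive estimate above can be reproduced with a short Python sketch.
The function and argument names are illustrative only and are not part of any
OpenStack tool; the latency and seek time are the values from the example.

```python
def single_disk_iops(avg_latency_s, avg_seek_time_s):
    """Estimate IOPS for one spinning disk.

    Both arguments are averages expressed in seconds.
    """
    return 1.0 / (avg_latency_s + avg_seek_time_s)


# Values from the example: 2.99 ms average latency, 4.7 ms average seek time.
disk_iops = single_disk_iops(0.00299, 0.0047)
print(round(disk_iops))  # 130
```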
To calculate maximum IOPS for a disk array:
Maximum Read IOPS:
To calculate maximum read IOPS for a disk array,
multiply the number of disks in the array by the maximum read IOPS per disk.
maxReadIOPS = nDisks * diskMaxIOPS
For example, 15 10K Spinning Disks would be measured the following way:
maxReadIOPS = 15 * 130
maxReadIOPS = 1950
Maximum write IOPS per array:
Determining the maximum *write* IOPS is a little different because most
administrators configure disk replication using RAID and since the RAID
controller requires IOPS itself, there is a write penalty. The severity of
the write penalty is determined by the type of RAID used.
===========  ==========
RAID type    Penalty
===========  ==========
1            2
5            4
10           2
===========  ==========
.. note::
RAID 5 has the worst penalty because it requires the most cross-disk writes.
Therefore, using the above examples, a 15-disk array using RAID 5 is
capable of 1950 read IOPS; however, we need to apply the penalty when
determining the *write* IOPS:
.. code-block:: text
maxWriteIOPS = 1950 / 4
maxWriteIOPS = 487.5
A RAID 5 array delivers only 25% of its read IOPS for writes, while a RAID
1 array in this case would produce a maximum of 975 write IOPS.
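The array calculations above can be sketched in Python as follows. The
function names and the ``RAID_WRITE_PENALTY`` dictionary are illustrative
assumptions; the penalty values are copied from the RAID table above, and real
arrays may behave differently depending on controller caching and workload.

```python
# Write penalties copied from the RAID table in this section.
RAID_WRITE_PENALTY = {1: 2, 5: 4, 10: 2}


def max_read_iops(n_disks, disk_max_iops):
    """Upper bound on read IOPS: every disk can serve reads in parallel."""
    return n_disks * disk_max_iops


def max_write_iops(n_disks, disk_max_iops, raid_level):
    """Upper bound on write IOPS after dividing by the RAID write penalty."""
    return max_read_iops(n_disks, disk_max_iops) / RAID_WRITE_PENALTY[raid_level]


print(max_read_iops(15, 130))      # 1950
print(max_write_iops(15, 130, 5))  # 487.5
print(max_write_iops(15, 130, 1))  # 975.0
```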
What about SSD? DRAM SSD?
In an HDD, data transfer is sequential. The read/write head must "seek" to a
point on the platter before it can execute the operation, and seek time is
significant. Transfer rate can also be influenced by file system fragmentation
and the data layout. Finally, the mechanical nature of hard disks imposes its
own performance limitations.
In an SSD, data access is *not* sequential; any location can be reached
directly, which makes random access faster. Read performance is consistent
because SSDs have no read/write heads, so the physical location of data is
irrelevant and there are no delays due to head motion (seeking).
.. note::
Some basic benchmarks for small reads and writes:
- **HDDs**: Small reads 175 IOPS, Small writes 280 IOPS
- **Flash SSDs**: Small reads 1075 IOPS (6x), Small writes 21 IOPS (0.1x)
- **DRAM SSDs**: Small reads 4091 IOPS (23x), Small writes 4184 IOPS
(14x)
Scalability
Scalability, along with expandability, is a major consideration in
a general purpose OpenStack cloud. It might be difficult to predict the final
intended size of the implementation as there are no established usage patterns
for a general purpose cloud. It might become necessary to expand the initial
deployment in order to accommodate growth and user demand. Many vendors have
implemented their own solutions to this problem. Some use clustered file
systems that span multiple appliances, while others have similar technologies
to allow block storage to scale past a fixed capacity. Ceph, a distributed
storage solution that offers block storage, was designed to solve this scale
issue and does not have the same limitations on domains, clusters, or scale
issues of other appliance driven models.
Expandability
Expandability is a major architecture factor for storage solutions
with general purpose OpenStack cloud. A storage solution that
expands to 50 PB is considered more expandable than a solution that
only scales to 10 PB. This metric is related to scalability, which is
the measure of a solution's performance as it expands.
Implementing Block Storage
--------------------------
Configure Block Storage resource nodes with advanced RAID controllers
and high-performance disks to provide fault tolerance at the hardware
level.
We recommend deploying high performing storage solutions such as SSD
drives or flash storage systems for applications requiring additional
performance out of Block Storage devices.
In environments that place substantial demands on Block Storage, we
recommend using multiple storage pools. In this case, each pool of
devices should have a similar hardware design and disk configuration
across all hardware nodes in that pool. This allows for a design that
provides applications with access to a wide variety of Block Storage pools,
each with their own redundancy, availability, and performance
characteristics. When deploying multiple pools of storage, it is also
important to consider the impact on the Block Storage scheduler which is
responsible for provisioning storage across resource nodes. Ideally,
ensure that applications can schedule volumes in multiple regions, each with
their own network, power, and cooling infrastructure. This will give tenants
the option of building fault-tolerant applications that are distributed
across multiple availability zones.
In addition to the Block Storage resource nodes, it is important to
design for high availability and redundancy of the APIs, and related
services that are responsible for provisioning and providing access to
storage. We recommend designing a layer of hardware or software load
balancers in order to achieve high availability of the appropriate REST
API services to provide uninterrupted service. In some cases, it may
also be necessary to deploy an additional layer of load balancing to
provide access to back-end database services responsible for servicing
and storing the state of Block Storage volumes. It is imperative that a
highly available database cluster is used to store the Block Storage metadata.
In a cloud with significant demands on Block Storage, the network
architecture should take into account the amount of East-West bandwidth
required for instances to make use of the available storage resources.
The selected network devices should support jumbo frames for
transferring large blocks of data, and utilize a dedicated network for
providing connectivity between instances and Block Storage.
Implementing Object Storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~
While consistency and partition tolerance are both inherent features of
the Object Storage service, it is important to design the overall
storage architecture to ensure that the implemented system meets those goals.
The OpenStack Object Storage service places a specific number of
data replicas as objects on resource nodes. Replicas are distributed
throughout the cluster, based on a consistent hash ring also stored on
each node in the cluster.
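To make the idea of a hash ring more concrete, the following Python sketch
maps an object path to a partition by taking the top bits of its MD5 digest,
then picks devices for the replicas. This is a simplified illustration under
stated assumptions, not the Object Storage implementation: the function names,
the example path, and the round-robin placement are hypothetical, and a real
ring also takes zones and device weights into account. The number of bits
kept (``part_power``) determines how many partitions exist.

```python
import hashlib
import struct


def partition_for(path, part_power):
    """Map an object path to a partition number.

    Uses the top ``part_power`` bits of the first 32 bits of the MD5 digest.
    """
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top_32_bits = struct.unpack_from(">I", digest)[0]
    return top_32_bits >> (32 - part_power)


def devices_for(partition, devices, replicas=3):
    """Toy placement: choose ``replicas`` consecutive devices round-robin.

    The devices are distinct only if len(devices) >= replicas.
    """
    return [devices[(partition + i) % len(devices)] for i in range(replicas)]


part = partition_for("/AUTH_demo/photos/cat.jpg", part_power=10)
print(part, devices_for(part, ["d0", "d1", "d2", "d3"]))
```

Because every node holds the same ring data, any node can compute where the
replicas of an object belong without consulting a central service.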
When designing your cluster, you must consider durability and
availability, which depend on the spread and placement of your data
rather than on the reliability of the hardware.
Consider the default value of the number of replicas, which is three. This
means that before an object is marked as having been written, at least two
copies exist. If a single server fails to write, the third copy may or
may not yet exist when the write operation initially returns. Increasing this
number increases the robustness of your data, but reduces the amount of
storage you have available. Look at the placement of your servers. Consider
spreading them widely throughout your data center's network and power-failure
zones. Is a zone a rack, a server, or a disk?
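The trade-off between replica count and available storage can be estimated
with a small Python sketch. The function names and the 300 TB figure are
illustrative assumptions. The write threshold is modeled here as a simple
majority, which matches the "at least two copies" behavior described above for
three replicas; the exact rule used by a given release may differ for other
replica counts.

```python
def usable_capacity_tb(raw_capacity_tb, replicas=3):
    """Capacity left for user data when every object is stored ``replicas`` times."""
    return raw_capacity_tb / replicas


def write_majority(replicas=3):
    """Copies that must exist before a write is reported as successful.

    Assumption for illustration: a simple majority of the replicas.
    """
    return replicas // 2 + 1


print(usable_capacity_tb(300))     # 100.0
print(usable_capacity_tb(300, 4))  # 75.0
print(write_majority(3))           # 2
```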
Consider these main traffic flows for an Object Storage network:
* Among :term:`object`, :term:`container`, and
:term:`account servers <account server>`
* Between servers and the proxies
* Between the proxies and your users
Object Storage frequently communicates among servers hosting data. Even a small
cluster generates megabytes per second of traffic.
Consider the scenario where an entire server fails and 24 TB of data
needs to be transferred "immediately" to remain at three copies — this can
put significant load on the network.
Another consideration is that when a new file is uploaded, the proxy server
must write out as many streams as there are replicas, multiplying network
traffic. For a three-replica cluster, 10 Gbps in means 30 Gbps out. Combining
this with the previous high bandwidth demands of replication is what results
in the recommendation
that your private network be of significantly higher bandwidth than your public
network requires. OpenStack Object Storage communicates internally with
unencrypted, unauthenticated rsync for performance, so the private
network is required.
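The bandwidth figures above can be checked with a rough Python estimate. The
function names are hypothetical, and the recovery-time calculation assumes
decimal units (1 TB = 8000 gigabits) and a fully utilized link, so real
recovery times will be longer.

```python
def proxy_outbound_gbps(inbound_gbps, replicas=3):
    """Traffic a proxy writes to storage nodes for a given upload rate."""
    return inbound_gbps * replicas


def rereplication_hours(data_tb, link_gbps):
    """Hours needed to copy ``data_tb`` terabytes over a ``link_gbps`` link.

    Assumes decimal units and 100% link utilization (a best case).
    """
    gigabits = data_tb * 8000
    return gigabits / link_gbps / 3600


print(proxy_outbound_gbps(10))                 # 30
print(round(rereplication_hours(24, 10), 1))   # 5.3
```

Even in this best case, restoring 24 TB over a 10 Gbps link occupies the link
for several hours, which illustrates why the private network needs headroom.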
The remaining point on bandwidth is the public-facing portion. The
``swift-proxy`` service is stateless, which means that you can easily
add more and use HTTP load-balancing methods to share bandwidth and
availability between them. More proxies means more bandwidth.
You should consider designing the Object Storage system with a sufficient
number of zones to provide quorum for the number of replicas defined. For
example, with three replicas configured in the swift cluster, the recommended
number of zones to configure within the Object Storage cluster in order to
achieve quorum is five. While it is possible to deploy a solution with
fewer zones, the implied risk of doing so is that some data may not be
available and API requests to certain objects stored in the cluster
might fail. For this reason, ensure you properly account for the number
of zones in the Object Storage cluster.
Each Object Storage zone should be self-contained within its own
availability zone. Each availability zone should have independent access
to network, power, and cooling infrastructure to ensure uninterrupted
access to data. In addition, a pool of Object Storage proxy servers
providing access to data stored on the object nodes should service each
availability zone. Object proxies in each region should leverage local
read and write affinity so that local storage resources facilitate
access to objects wherever possible. We recommend deploying upstream
load balancing to ensure that proxy services are distributed across the
multiple zones and, in some cases, it may be necessary to make use of
third-party solutions to aid with geographical distribution of services.
A zone within an Object Storage cluster is a logical division. Any of
the following may represent a zone:
* A disk within a single node
* One zone per node
* Zone per collection of nodes
* Multiple racks
* Multiple data centers
Selecting the proper zone design is crucial for allowing the Object
Storage cluster to scale while providing an available and redundant
storage system. It may be necessary to configure storage policies that
have different requirements with regards to replicas, retention, and
other factors that could heavily affect the design of storage in a
specific zone.
Planning and scaling storage capacity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
An important consideration in running a cloud over time is projecting growth
and utilization trends in order to plan capital expenditures for the short and
long term. Gather utilization meters for compute, network, and storage, along
with historical records of these meters. While securing major anchor tenants
can lead to rapid jumps in the utilization of resources, the average rate of
adoption of cloud services through normal usage also needs to be carefully
monitored.
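As one possible way to turn historical utilization records into a projection,
the following Python sketch fits a straight line to monthly usage samples and
estimates how many months remain before a capacity limit is reached. The
function name and sample data are hypothetical, and a linear trend is only an
assumption; step changes such as a new anchor tenant are not captured by it.

```python
def months_until_full(used_tb_by_month, capacity_tb):
    """Estimate months remaining until usage reaches ``capacity_tb``.

    Fits a least-squares line to the samples (one per month, oldest first).
    Returns None if usage is flat or shrinking.
    """
    n = len(used_tb_by_month)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(used_tb_by_month) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, used_tb_by_month))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Month index where the fitted line crosses capacity, relative to the last sample.
    return (capacity_tb - intercept) / slope - (n - 1)


print(months_until_full([100, 110, 120, 130], capacity_tb=200))  # 7.0
```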
Scaling Block Storage
---------------------
You can upgrade Block Storage pools to add storage capacity without
interrupting the overall Block Storage service. Add nodes to the pool by
installing and configuring the appropriate hardware and software and
then allowing that node to report in to the proper storage pool through the
message bus. Block Storage nodes generally report into the scheduler
service advertising their availability. As a result, after the node is
online and available, tenants can make use of those storage resources
instantly.
In some cases, the demand on Block Storage may exhaust the available
network bandwidth. As a result, design network infrastructure that
services Block Storage resources in such a way that you can add capacity
and bandwidth easily. This often involves the use of dynamic routing
protocols or advanced networking solutions to add capacity to downstream
devices easily. Both the front-end and back-end storage network designs
should encompass the ability to quickly and easily add capacity and
bandwidth.
.. note::
Sufficient monitoring and data collection should be in-place
from the start, such that timely decisions regarding capacity,
input/output metrics (IOPS) or storage-associated bandwidth can
be made.
Scaling Object Storage
----------------------
Adding back-end storage capacity to an Object Storage cluster requires
careful planning and forethought. In the design phase, it is important
to determine the maximum partition power required by the Object Storage
service, which determines the maximum number of partitions which can
exist. Object Storage distributes data among all available storage, but
a partition cannot span more than one disk, so the maximum number of
partitions can only be as high as the number of disks.
For example, a system that starts with a single disk and a partition
power of 3 can have 8 (2^3) partitions. Adding a second disk means that
each has 4 partitions. The one-disk-per-partition limit means that this
system can never have more than 8 disks, limiting its scalability.
However, a system that starts with a single disk and a partition power
of 10 can have up to 1024 (2^10) disks.
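The arithmetic in this example can be expressed as a short Python sketch. The
function names are illustrative and do not correspond to any Object Storage
tool.

```python
def max_partitions(part_power):
    """Number of partitions, and therefore the maximum number of disks."""
    return 2 ** part_power


def partitions_per_disk(part_power, n_disks):
    """Average partitions held by each disk when spread evenly."""
    total = max_partitions(part_power)
    if n_disks > total:
        raise ValueError("a partition cannot span more than one disk")
    return total / n_disks


print(max_partitions(3))          # 8
print(partitions_per_disk(3, 2))  # 4.0
print(max_partitions(10))         # 1024
```

Choosing the partition power therefore fixes an upper bound on the number of
disks the cluster can ever use, which is why it is decided in the design phase.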
As you add back-end storage capacity to the system, the partition maps
redistribute data amongst the storage nodes. In some cases, this
involves replication of extremely large data sets. In these cases, we
recommend using back-end replication links that do not contend with
tenants' access to data.
As more tenants begin to access data within the cluster and their data
sets grow, it is necessary to add front-end bandwidth to service data
access requests. Adding front-end bandwidth to an Object Storage cluster
requires careful planning and design of the Object Storage proxies that
tenants use to gain access to the data, along with the high availability
solutions that enable easy scaling of the proxy layer. We recommend
designing a front-end load balancing layer that tenants and consumers
use to gain access to data stored within the cluster. This load
balancing layer may be distributed across zones, regions or even across
geographic boundaries, which may also require that the design encompass
geo-location solutions.
In some cases, you must add bandwidth and capacity to the network
resources servicing requests between proxy servers and storage nodes.
For this reason, the network architecture used for access to storage
nodes and proxy servers should make use of a design which is scalable.
Redundancy
----------
When making swift more redundant, one approach is to add additional proxy
servers and load balancing. HAProxy is one method of providing load
balancing and high availability and is often combined with keepalived
or pacemaker to ensure the HAProxy service maintains a stable VIP.
Sample HAProxy configurations can be found in the `OpenStack HA Guide
<https://docs.openstack.org/ha-guide/>`_.
Replication
-----------
Replicas in Object Storage function independently, and clients only
require a majority of nodes to respond to a request in order for an
operation to be considered successful. Thus, transient failures like
network partitions can quickly cause replicas to diverge.
These differences are eventually reconciled by
asynchronous, peer-to-peer replicator processes. The replicator processes
traverse their local filesystems, concurrently performing operations in a
manner that balances load across physical disks.
Replication uses a push model, with records and files generally only being
copied from local to remote replicas. This is important because data on the
node may not belong there (as in the case of handoffs and ring changes), and a
replicator cannot know what data exists elsewhere in the cluster that it
should pull in. It is the duty of any node that contains data to ensure that
data gets to where it belongs. Replica placement is handled by the ring.
Every deleted record or file in the system is marked by a tombstone, so that
deletions can be replicated alongside creations. The replication process cleans
up tombstones after a time period known as the consistency window. The
consistency window encompasses replication duration and the length of time a
transient failure can remove a node from the cluster. Tombstone cleanup must be
tied to replication to reach replica convergence.
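The push model and tombstone handling described above can be illustrated with
a toy Python model. This is a sketch under stated assumptions, not the Object
Storage replicator: the data structures and function names are hypothetical,
each store is a plain dictionary mapping a name to a ``(timestamp, data)``
pair, and a tombstone is represented by ``data`` being ``None``.

```python
def push_replicate(local, remote):
    """Push records from ``local`` to ``remote``; the newest timestamp wins.

    Nothing is ever pulled from ``remote`` into ``local``.
    """
    for name, (timestamp, data) in local.items():
        if name not in remote or remote[name][0] < timestamp:
            remote[name] = (timestamp, data)


def reclaim_tombstones(store, now, consistency_window):
    """Remove tombstones older than the consistency window."""
    expired = [name for name, (timestamp, data) in store.items()
               if data is None and now - timestamp > consistency_window]
    for name in expired:
        del store[name]


local = {"a": (2, None), "b": (1, "payload")}  # "a" was deleted at time 2
remote = {"a": (1, "stale copy")}
push_replicate(local, remote)
print(remote)  # the deletion of "a" has been replicated as a tombstone
reclaim_tombstones(remote, now=100, consistency_window=50)
print(remote)  # only "b" remains
```

If the tombstone were removed before it had been pushed to every replica, the
stale copy of ``a`` could reappear, which is why cleanup is tied to
replication.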
If a replicator detects that a remote drive has failed, the replicator uses the
``get_more_nodes`` interface for the ring to choose an alternative node with
which to synchronize. The replicator can maintain desired levels of replication
in the face of disk failures, though some replicas may not be in an immediately
usable location.
.. note::
The replicator does not maintain desired levels of replication when other
failures occur, such as entire node failures, because most failures are
transient.
Replication is an area of active development, and implementation details
are likely to change over time.
There are two major classes of replicator: the db replicator, which replicates
accounts and containers, and the object replicator, which replicates object
data.
For more information, please see the `Swift replication page <https://docs.openstack.org/swift/latest/overview_replication.html>`_.


@ -0,0 +1,329 @@
================
Storage concepts
================
Storage is found in many parts of the OpenStack cloud environment. It is
important to understand the distinction between
:term:`ephemeral <ephemeral volume>` storage and
:term:`persistent <persistent volume>` storage:
- Ephemeral storage - If you only deploy OpenStack
:term:`Compute service (nova)`, by default your users do not have access to
any form of persistent storage. The disks associated with VMs are ephemeral,
meaning that from the user's point of view they disappear when a virtual
machine is terminated.
- Persistent storage - Persistent storage means that the storage resource
outlives any other resource and is always available, regardless of the state
of a running instance.
OpenStack clouds explicitly support three types of persistent
storage: *Object Storage*, *Block Storage*, and *File-based storage*.
Object storage
~~~~~~~~~~~~~~
Object storage is implemented in OpenStack by the
Object Storage service (swift). Users access binary objects through a REST API.
If your intended users need to archive or manage large datasets, you should
provide them with Object Storage service. Additional benefits include:
- OpenStack can store your virtual machine (VM) images inside of an Object
Storage system, as an alternative to storing the images on a file system.
- Object Storage integrates with OpenStack Identity and works with the
OpenStack Dashboard.
- Better support for distributed deployments across multiple datacenters
through support for asynchronous eventual consistency replication.
You should consider using the OpenStack Object Storage service if you eventually
plan on distributing your storage cluster across multiple data centers, if you
need unified accounts for your users for both compute and object storage, or if
you want to control your object storage with the OpenStack Dashboard. For more
information, see the `Swift project page <https://www.openstack.org/software/releases/ocata/components/swift>`_.
Block storage
~~~~~~~~~~~~~
Block storage is implemented in OpenStack by the
Block Storage service (cinder). Because these volumes are
persistent, they can be detached from one instance and re-attached to another
instance and the data remains intact.
The Block Storage service supports multiple back ends in the form of drivers.
Your choice of a storage back end must be supported by a block storage
driver.
Most block storage drivers allow the instance to have direct access to
the underlying storage hardware's block device. This helps increase the
overall read/write IO. However, support for utilizing files as volumes
is also well established, with full support for NFS, GlusterFS and
others.
These drivers work a little differently than a traditional block
storage driver. On an NFS or GlusterFS file system, a single file is
created and then mapped as a virtual volume into the instance. This
mapping and translation is similar to how OpenStack utilizes QEMU's
file-based virtual machines stored in ``/var/lib/nova/instances``.
File-based storage
~~~~~~~~~~~~~~~~~~
In a multi-tenant OpenStack cloud environment, the Shared File Systems service
(manila) provides a set of services for management of shared file systems. The
Shared File Systems service supports multiple back-ends in the form of drivers,
and can be configured to provision shares from one or more back-ends. Share
servers are virtual machines that export file shares using different file
system protocols such as NFS, CIFS, GlusterFS, or HDFS.
The Shared File Systems service is persistent storage and can be mounted to any
number of client machines. It can also be detached from one instance and
attached to another instance without data loss. During this process the data
are safe unless the Shared File Systems service itself is changed or removed.
Users interact with the Shared File Systems service by mounting remote file
systems on their instances and then using those file systems for
file storage and exchange. The Shared File Systems service provides shares,
each of which is a remote, mountable file system. A share can be mounted and
accessed from several hosts by several users at a time. With shares, you can also:
* Create a share specifying its size, shared file system protocol, and
visibility level.
* Create a share on either a share server or standalone, depending on
the selected back-end mode, with or without using a share network.
* Specify access rules and security services for existing shares.
* Combine several shares in groups to keep data consistent inside the
groups, so that subsequent group operations are safe.
* Create a snapshot of a selected share or a share group for storing
the existing shares consistently or creating new shares from that
snapshot in a consistent way.
* Create a share from a snapshot.
* Set rate limits and quotas for specific shares and snapshots.
* View usage of share resources.
* Remove shares.
Differences between storage types
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:ref:`table_openstack_storage` explains the differences between OpenStack
storage types.
.. _table_openstack_storage:
.. list-table:: Table. OpenStack storage
:widths: 20 20 20 20 20
:header-rows: 1
* -
- Ephemeral storage
- Block storage
- Object storage
- Shared File System storage
* - Application
- Run operating system and scratch space
- Add additional persistent storage to a virtual machine (VM)
- Store data, including VM images
- Add additional persistent storage to a virtual machine
* - Accessed through…
- A file system
- A block device that can be partitioned, formatted, and mounted
(such as /dev/vdc)
- The REST API
- A Shared File Systems service share (either manila managed or an
external one registered in manila) that can be mounted as a remote
file system
* - Accessible from…
- Within a VM
- Within a VM
- Anywhere
- Within a VM
* - Managed by…
- OpenStack Compute (nova)
- OpenStack Block Storage (cinder)
- OpenStack Object Storage (swift)
- OpenStack Shared File System Storage (manila)
* - Persists until…
- VM is terminated
- Deleted by user
- Deleted by user
- Deleted by user
* - Sizing determined by…
- Administrator configuration of size settings, known as *flavors*
- User specification in initial request
- Amount of available physical storage
- * User specification in initial request
* Requests for extension
* Available user-level quotas
* Limitations applied by Administrator
* - Encryption configuration
- Parameter in ``nova.conf``
- Admin establishing `encrypted volume type
<https://docs.openstack.org/admin-guide/dashboard-manage-volumes.html>`_,
then user selecting encrypted volume
- Not yet available
- Shared File Systems service does not apply any additional encryption
above what the share's back-end storage provides
* - Example of typical usage…
- 10 GB first disk, 30 GB second disk
- 1 TB disk
- 10s of TBs of dataset storage
- Depends completely on the size of back-end storage specified when
a share was being created. In case of thin provisioning it can be
partial space reservation (for more details see
`Capabilities and Extra-Specs
<https://docs.openstack.org/manila/latest/contributor/capabilities_and_extra_specs.html#common-capabilities>`_
specification)
.. note::
**File-level storage for live migration**
With file-level storage, users access stored data using the operating
system's file system interface. Most users who have used a network
storage solution before have encountered this form of networked
storage. The most common file system protocol for Unix is NFS, and for
Windows, CIFS (previously, SMB).
OpenStack clouds do not present file-level storage to end users.
However, it is important to consider file-level storage for storing
instances under ``/var/lib/nova/instances`` when designing your cloud,
since you must have a shared file system if you want to support live
migration.
Commodity storage technologies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are various commodity storage back end technologies available. Depending
on your cloud users' needs, you can implement one or many of these technologies
in different combinations.
Ceph
----
Ceph is a scalable storage solution that replicates data across commodity
storage nodes.
Ceph uses an object storage mechanism for data storage and exposes
the data to the end user through different types of storage interfaces. It
supports interfaces for:
- Object storage
- Block storage
- File-system interfaces
Ceph provides support for the same Object Storage API as swift and can
be used as a back end for the Block Storage service (cinder) as well as
back-end storage for glance images.
Ceph supports thin provisioning implemented using copy-on-write. This can
be useful when booting from volume because a new volume can be provisioned
very quickly. Ceph also supports keystone-based authentication (as of
version 0.56), so it can be a seamless swap in for the default OpenStack
swift implementation.
Ceph's advantages include:
- The administrator has more fine-grained control over data distribution and
replication strategies.
- Consolidation of object storage and block storage.
- Fast provisioning of boot-from-volume instances using thin provisioning.
- Support for the distributed file-system interface
`CephFS <http://ceph.com/docs/master/cephfs/>`_.
You should consider Ceph if you want to manage your object and block storage
within a single system, or if you want to support fast boot-from-volume.
Gluster
-------
Gluster is a distributed shared file system. As of Gluster version 3.3, you
can use Gluster to consolidate your object storage and file storage
into one unified file and object storage solution, which is called
Gluster For OpenStack (GFO). GFO uses a customized version of swift
that enables Gluster to be used as the back-end storage.
The main reason to use GFO rather than swift is if you also
want to support a distributed file system, either to support shared
storage live migration or to provide it as a separate service to
your end users. If you want to manage your object and file storage
within a single system, you should consider GFO.
LVM
---
The Logical Volume Manager (LVM) is a Linux-based system that provides an
abstraction layer on top of physical disks to expose logical volumes
to the operating system. The LVM back-end implements block storage
as LVM logical partitions.
On each host that will house block storage, an administrator must
initially create a volume group dedicated to Block Storage volumes.
Blocks are created from LVM logical volumes.
.. note::
LVM does *not* provide any replication. Typically,
administrators configure RAID on nodes that use LVM as block
storage to protect against failures of individual hard drives.
However, RAID does not protect against a failure of the entire
host.
iSCSI
-----
Internet Small Computer Systems Interface (iSCSI) is a network protocol that
operates on top of the Transmission Control Protocol (TCP) for linking data
storage devices. It transports data between an iSCSI initiator on a server
and iSCSI target on a storage device.
iSCSI is suitable for cloud environments with Block Storage service to support
applications or for file sharing systems. Network connectivity can be
achieved at a lower cost compared to other storage back end technologies since
iSCSI does not require host bus adaptors (HBA) or storage-specific network
devices.
.. Add tips? iSCSI traffic on a separate network or virtual vLAN?
NFS
---
Network File System (NFS) is a file system protocol that allows a user or
administrator to mount a file system on a server. File clients can access
mounted file systems through Remote Procedure Calls (RPC).
The benefits of NFS are low implementation cost, due to shared NICs and
traditional network components, and a simpler configuration and setup process.
For more information on configuring Block Storage to use NFS storage, see
`Configure an NFS storage back end
<https://docs.openstack.org/admin-guide/blockstorage-nfs-backend.html>`_ in the
OpenStack Administrator Guide.
Sheepdog
--------
Sheepdog is a userspace distributed storage system. Sheepdog scales
to several hundred nodes, and has powerful virtual disk management
features such as snapshots, cloning, rollback, and thin provisioning.
It is essentially an object storage system that manages disks and
aggregates their space and performance linearly, at hyperscale, on
commodity hardware. On top of its object store, Sheepdog provides an
elastic volume service and an HTTP service.
Sheepdog does require a specific kernel version and can work
nicely with xattr-supported file systems.
ZFS
---
The Solaris iSCSI driver for OpenStack Block Storage implements
blocks as ZFS entities. ZFS is a file system that also has the
functionality of a volume manager. This differs from a Linux system,
where the volume manager (LVM) and the file system
(such as ext3, ext4, XFS, or Btrfs) are separate. ZFS has a number of
advantages over ext4, including improved data-integrity checking.
The ZFS back end for OpenStack Block Storage supports only
Solaris-based systems, such as Illumos. While there is a Linux port
of ZFS, it is not included in any of the standard Linux
distributions, and it has not been tested with OpenStack Block
Storage. As with LVM, ZFS does not provide replication across hosts
on its own; you need to add a replication solution on top of ZFS if
your cloud needs to be able to handle storage-node failures.

50
doc/source/design.rst Normal file
View File

@ -0,0 +1,50 @@
.. _design:
======
Design
======
Designing an OpenStack cloud requires an understanding of the cloud user's
requirements and needs to determine the best possible configuration. This
chapter provides guidance on the decisions you need to make during the
design process.
To design, deploy, and configure OpenStack, administrators must
understand the logical architecture. OpenStack modules are one of the
following types:
Daemon
Runs as a background process. On Linux platforms, a daemon is usually
installed as a service.
Script
Installs a virtual environment and runs tests.
Command-line interface (CLI)
Enables users to submit API calls to OpenStack services through commands.
:ref:`logical_architecture` shows one example of the most common
integrated services within OpenStack and how they interact with each
other. End users can interact through the dashboard, CLIs, and APIs.
All services authenticate through a common Identity service, and
individual services interact with each other through public APIs, except
where privileged administrator commands are necessary.
.. _logical_architecture:
.. figure:: common/figures/osog_0001.png
:width: 100%
:alt: OpenStack Logical Architecture
OpenStack Logical Architecture
.. toctree::
:maxdepth: 2
design-compute.rst
design-storage.rst
design-networking.rst
design-identity.rst
design-images.rst
design-control-plane.rst
design-cmp-tools.rst

52
doc/source/index.rst Normal file
View File

@ -0,0 +1,52 @@
.. meta::
:description: This guide targets OpenStack Architects
for architectural design
:keywords: Architecture, OpenStack
===================================
OpenStack Architecture Design Guide
===================================
.. important::
**This guide is in the process of being updated after a period of
ownership transition. If you wish to update this guide, propose a
patch at your own leisure.**
This guide was last updated as of the Pike release, documenting
the OpenStack Ocata, Newton, and Mitaka releases. It may
not apply to EOL releases Kilo and Liberty.
We advise that you read this at your own discretion when planning
your OpenStack cloud. This guide is intended as advice only.
Abstract
~~~~~~~~
The Architecture Design Guide provides information on planning and designing
an OpenStack cloud. It explains core concepts, cloud architecture design
requirements, and the design criteria of key components and services in an
OpenStack cloud. The guide also describes five common cloud use cases.
Before reading this book, we recommend that you have:
* Prior knowledge of cloud architecture and principles.
* Linux and virtualization experience.
* A basic understanding of networking principles and protocols.
For information about deploying and operating OpenStack, see the
`Installation Guides <https://docs.openstack.org/ocata/install/>`_,
`Deployment Guides <https://docs.openstack.org/ocata/deploy/>`_,
and the `OpenStack Operations Guide <https://docs.openstack.org/operations-guide/>`_.
Contents
~~~~~~~~
.. toctree::
:maxdepth: 2
common/conventions.rst
arch-requirements.rst
design.rst
use-cases.rst
common/appendix.rst

14
doc/source/use-cases.rst Normal file
View File

@ -0,0 +1,14 @@
.. _use-cases:
=========
Use cases
=========
.. toctree::
:maxdepth: 2
use-cases/use-case-development
use-cases/use-case-general-compute
use-cases/use-case-web-scale
use-cases/use-case-storage
use-cases/use-case-nfv

View File

@ -0,0 +1,14 @@
.. _development-cloud:
=================
Development cloud
=================
Design model
~~~~~~~~~~~~
Requirements
~~~~~~~~~~~~
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

View File

@ -0,0 +1,196 @@
.. _general-compute-cloud:
=====================
General compute cloud
=====================
Design model
~~~~~~~~~~~~
An online classified advertising company wants to run web applications
consisting of Tomcat, Nginx, and MariaDB in a private cloud. To meet the
policy requirements, the cloud infrastructure will run in their
own data center. The company has predictable load requirements but
requires scaling to cope with nightly increases in demand. Their current
environment does not have the flexibility to align with their goal of
running an open source API environment. The current environment consists
of the following:
* Between 120 and 140 installations of Nginx and Tomcat, each with 2
vCPUs and 4 GB of RAM
* A three-node MariaDB and Galera cluster, in which each node has 4 vCPUs and
8 GB of RAM
The company runs hardware load balancers and multiple web applications
serving their websites and orchestrates environments using combinations
of scripts and Puppet. The website generates large amounts of log data
daily that requires archiving.
The solution would consist of the following OpenStack components:
* A firewall, switches, and load balancers on the public-facing network
connections.
* OpenStack Controller service running Image service, Identity service,
Networking service, combined with support services such as MariaDB and
RabbitMQ, configured for high availability on at least three controller
nodes.
* OpenStack compute nodes running the KVM hypervisor.
* OpenStack Block Storage for use by compute instances that require
persistent storage (such as databases for dynamic sites).
* OpenStack Object Storage for serving static objects (such as images).
.. figure:: ../figures/General_Architecture3.png
Running up to 140 web instances and the small number of MariaDB
instances requires 292 vCPUs and 584 GB of RAM. On a
typical 1U server using dual-socket hex-core Intel CPUs with
Hyper-Threading, and assuming a 2:1 CPU overcommit ratio, this would
require 8 OpenStack Compute nodes.
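The sizing figures above can be reproduced with simple arithmetic. The following is a minimal sketch rather than a sizing tool: the constant and function names are illustrative, and it considers CPU only. By CPU alone it yields a floor of 7 nodes, so the figure of 8 presumably leaves headroom (for example, for per-host RAM or the loss of a node), which the text does not break down.

```python
import math

# Workload figures taken from the example above.
WEB_INSTANCES = 140            # upper bound of the Nginx/Tomcat installations
WEB_VCPUS, WEB_RAM_GB = 2, 4
DB_INSTANCES = 3               # MariaDB/Galera cluster nodes
DB_VCPUS, DB_RAM_GB = 4, 8

# Host figures: dual-socket, hex-core, Hyper-Threading, 2:1 CPU overcommit.
SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE = 2, 6, 2
CPU_OVERCOMMIT = 2


def required_vcpus() -> int:
    """Total vCPUs requested by all instances."""
    return WEB_INSTANCES * WEB_VCPUS + DB_INSTANCES * DB_VCPUS


def required_ram_gb() -> int:
    """Total RAM in GB requested by all instances."""
    return WEB_INSTANCES * WEB_RAM_GB + DB_INSTANCES * DB_RAM_GB


def vcpus_per_node() -> int:
    """Schedulable vCPUs on one compute node after overcommit."""
    return SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE * CPU_OVERCOMMIT


def min_nodes_by_cpu() -> int:
    """Smallest node count that satisfies the vCPU demand (no headroom)."""
    return math.ceil(required_vcpus() / vcpus_per_node())


if __name__ == "__main__":
    print("vCPUs required:", required_vcpus())
    print("RAM required (GB):", required_ram_gb())
    print("vCPUs per node:", vcpus_per_node())
    print("Minimum nodes by CPU:", min_nodes_by_cpu())
    print("RAM per node across 8 nodes (GB):", required_ram_gb() / 8)
```

Spread across 8 nodes, the RAM demand works out to 73 GB per node before any allowance for the host operating system, which is worth checking against the memory actually installed in each server.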
The web application instances run from local storage on each of the
OpenStack Compute nodes. The web application instances are stateless,
meaning that any of the instances can fail and the application will
continue to function.
MariaDB server instances store their data on shared enterprise storage,
such as NetApp or SolidFire devices. If a MariaDB instance fails, the
storage is expected to be reattached to another instance, which then
rejoins the Galera cluster.
Logs from the web application servers are shipped to OpenStack Object
Storage for processing and archiving.
Additional capabilities can be realized by moving static web content to
be served from OpenStack Object Storage containers, and backing the
OpenStack Image service with OpenStack Object Storage.
.. note::
Increasing the use of OpenStack Object Storage means that network bandwidth
must be taken into consideration. Running OpenStack Object Storage with
network connections offering 10 GbE or better connectivity is
advised.
Leveraging the Orchestration and Telemetry services is also a
consideration when providing auto-scaling, orchestrated web application
environments. Defining the web applications in a
:term:`Heat Orchestration Template (HOT)`
removes the reliance on the current scripted Puppet
solution.
OpenStack Networking can be used to control hardware load balancers
through the use of plug-ins and the Networking API. This allows users to
control hardware load balancer pools and the instances that are members of
these pools, but their use in production environments must be carefully
weighed against current stability.
Requirements
~~~~~~~~~~~~
.. temporary location of storage information until we establish a template
Storage requirements
--------------------
Using a scale-out storage solution with direct-attached storage (DAS) in
the servers is well suited for a general purpose OpenStack cloud. Cloud
services requirements determine your choice of scale-out solution. You
need to determine if a single, highly expandable, vertically scalable,
centralized storage array is suitable for your design. After
determining an approach, select the storage hardware based on these
criteria.
This list expands upon the potential impacts of including a particular
storage architecture (and corresponding storage hardware) in the
design for a general purpose OpenStack cloud:
Connectivity
If storage protocols other than Ethernet are part of the storage solution,
ensure the appropriate hardware has been selected. If a centralized storage
array is selected, ensure that the hypervisor will be able to connect to
that storage array for image storage.
Usage
How the particular storage architecture will be used is critical for
determining the architecture. Some of the configurations that will
influence the architecture include whether it will be used by the
hypervisors for ephemeral instance storage, or if OpenStack Object
Storage will use it for object storage.
Instance and image locations
Where instances and images will be stored will influence the
architecture.
Server hardware
If the solution is a scale-out storage architecture that includes
DAS, it will affect the server hardware selection. This could ripple
into the decisions that affect host density, instance density, power
density, OS-hypervisor, management tools and others.
A general purpose OpenStack cloud has multiple options. The key factors
that will have an influence on selection of storage hardware for a
general purpose OpenStack cloud are as follows:
Capacity
Hardware resources selected for the resource nodes should be capable
of supporting enough storage for the cloud services. Defining the
initial requirements and ensuring the design can support adding
capacity is important. Hardware nodes selected for object storage
should be capable of supporting a large number of inexpensive disks
with no reliance on RAID controller cards. Hardware nodes selected
for block storage should be capable of supporting high speed storage
solutions and RAID controller cards to provide performance and
redundancy to storage at a hardware level. Selecting hardware RAID
controllers that automatically repair damaged arrays will assist
with the replacement and repair of degraded or deleted storage
devices.
Performance
Disks selected for object storage services do not need to be
fast-performing disks. We recommend that object storage nodes take
advantage of the best cost per terabyte available for storage.
In contrast, disks chosen for block storage services should take
advantage of performance-boosting features, which may entail the use
of SSDs or flash storage to provide high performance block storage
pools. Storage performance of ephemeral disks used for instances
should also be taken into consideration.
Fault tolerance
Object storage resource nodes have no requirements for hardware
fault tolerance or RAID controllers. It is not necessary to plan for
fault tolerance within the object storage hardware because the
object storage service provides replication between zones as a
feature of the service. Block storage nodes, compute nodes, and
cloud controllers should all have fault tolerance built in at the
hardware level by making use of hardware RAID controllers and
varying levels of RAID configuration. The level of RAID chosen
should be consistent with the performance and availability
requirements of the cloud.
Network hardware requirements
-----------------------------
For a compute-focused architecture, we recommend designing the network
architecture using a scalable network model that makes it easy to add
capacity and bandwidth. A good example of such a model is the leaf-spine
model. In this type of network design, you can add
bandwidth as well as scale out to additional racks of gear. It is important to
select network hardware that supports the required port count, port speed, and
port density while allowing for future growth as workload demands
increase. In the network architecture, it is also important to evaluate
where to provide redundancy.
Network software requirements
-----------------------------
For a general purpose OpenStack cloud, the OpenStack infrastructure
components need to be highly available. If the design does not include
hardware load balancing, networking software packages like HAProxy will
need to be included.
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

View File

@ -0,0 +1,181 @@
.. _nfv-cloud:
==============================
Network virtual function cloud
==============================
Design model
~~~~~~~~~~~~
Requirements
~~~~~~~~~~~~
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
Network-focused cloud examples
------------------------------
An organization designs a large-scale cloud-based web application. The
application scales horizontally in bursts and generates a
high instance count. The application requires an SSL connection to secure
data and must not lose connection state to individual servers.
The figure below depicts an example design for this workload. In this
example, a hardware load balancer provides SSL offload functionality and
connects to tenant networks in order to reduce address consumption. This
load balancer links to the routing architecture as it services the VIP
for the application. The router and load balancer use the GRE tunnel ID
of the application's tenant network and an IP address within the tenant
subnet but outside of the address pool. This is to ensure that the load
balancer can communicate with the application's HTTP servers without
requiring the consumption of a public IP address.
Because sessions persist until closed, the routing and switching
architecture provides high availability. Switches mesh to each
hypervisor and each other, and also provide an MLAG implementation to
ensure that layer-2 connectivity does not fail. Routers use VRRP and
fully mesh with switches to ensure layer-3 connectivity. Since GRE
provides an overlay network, Networking is present and uses the Open
vSwitch agent in GRE tunnel mode. This ensures all devices can reach all
other devices and that you can create tenant networks for private
addressing links to the load balancer.
.. figure:: ../figures/Network_Web_Services1.png
A web service architecture has many options and optional components. Due
to this, it can fit into a large number of other OpenStack designs. A
few key components, however, need to be in place to handle the nature of
most web-scale workloads. You require the following components:
* OpenStack Controller services (Image service, Identity service, Networking
service, and supporting services such as MariaDB and RabbitMQ)
* OpenStack Compute running KVM hypervisor
* OpenStack Object Storage
* Orchestration service
* Telemetry service
Beyond the normal Identity service, Compute service, Image service, and
Object Storage components, we recommend the Orchestration service component
to handle the proper scaling of workloads to adjust to demand. Due to the
requirement for auto-scaling, the design includes the Telemetry service.
Web services tend to be bursty in load, have well-defined peak and
valley usage patterns and, as a result, benefit from automatic scaling
of instances based upon traffic. At a network level, a split network
configuration works well with databases residing on private tenant
networks since these do not emit a large quantity of broadcast traffic
and may need to interconnect to some databases for content.
Load balancing
~~~~~~~~~~~~~~
Load balancing spreads requests across multiple instances. This workload
scales well horizontally across large numbers of instances. This enables
instances to run without publicly routed IP addresses and instead to
rely on the load balancer to provide a globally reachable service. Many
of these services do not require direct server return. This aids in
address planning and utilization at scale since only the virtual IP
(VIP) must be public.
Overlay networks
~~~~~~~~~~~~~~~~
The overlay functionality design includes OpenStack Networking in Open
vSwitch GRE tunnel mode. In this case, the layer-3 external routers pair
with VRRP, and switches pair with an implementation of MLAG to ensure
that you do not lose connectivity with the upstream routing
infrastructure.
Performance tuning
~~~~~~~~~~~~~~~~~~
Network level tuning for this workload is minimal. :term:`Quality of
Service (QoS)` applies to these workloads for a middle ground Class
Selector depending on existing policies. It is higher than a best effort
queue but lower than an Expedited Forwarding or Assured Forwarding
queue. Since this type of application generates larger packets with
longer-lived connections, you can optimize bandwidth utilization for
long-duration TCP. Normal bandwidth planning applies here with regard
to benchmarking a session's usage multiplied by the expected number of
concurrent sessions, plus overhead.
Network functions
~~~~~~~~~~~~~~~~~
Network functions are a broad category that encompasses workloads that
support the rest of a system's network. These workloads tend to consist
of large numbers of small packets that are very short lived, such as DNS
queries or SNMP traps. These messages need to arrive quickly and do not
handle packet loss well, as there can be a very large volume of them. There
are a few extra considerations to take into account for this type of
workload and this can change a configuration all the way to the
hypervisor level. For an application that generates 10 TCP sessions per
user with an average bandwidth of 512 kilobytes per second per flow and
expected user count of ten thousand concurrent users, the expected
bandwidth plan is approximately 4.88 gigabits per second.
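The planning rule from the performance tuning discussion, per-flow rate multiplied by the number of concurrent flows plus overhead, can be checked with a few lines of arithmetic. The following is a minimal sketch, not a capacity-planning tool; the function name, the use of binary (1024-based) prefixes, and the overhead parameter are assumptions made for illustration. Under those assumptions, 4.88 gigabits per second corresponds to 10,000 concurrent flows at 512 kilobits per second each. Counting 10 sessions for each of 10,000 users, or reading the per-flow rate as kilobytes, gives a total that is 10 or 80 times larger, so confirm the units and the flow count against your own measurements before sizing links.

```python
KIBI = 1024  # assumption: binary prefixes (1 Gbit = 1024 * 1024 kbit)


def aggregate_gbps(flows: int, per_flow_kbit_s: float, overhead: float = 0.0) -> float:
    """Aggregate bandwidth in gigabits per second.

    flows            -- number of concurrent flows
    per_flow_kbit_s  -- average rate of one flow, in kilobits per second
    overhead         -- fractional allowance for protocol overhead (0.1 = 10%)
    """
    return flows * per_flow_kbit_s * (1.0 + overhead) / (KIBI * KIBI)


if __name__ == "__main__":
    # 10,000 concurrent flows at 512 kilobits per second each.
    print(round(aggregate_gbps(10_000, 512), 2))

    # 10 sessions for each of 10,000 users, still 512 kilobits per second.
    print(round(aggregate_gbps(10 * 10_000, 512), 2))

    # Same flow count, but 512 kilobytes per second (8 bits per byte).
    print(round(aggregate_gbps(10 * 10_000, 512 * 8), 2))
```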
The supporting network for this type of configuration needs to have a
low latency and evenly distributed availability. This workload benefits
from having services local to the consumers of the service. Use a
multi-site approach as well as deploying many copies of the application
to handle load as close as possible to consumers. Since these
applications function independently, they do not warrant running
overlays to interconnect tenant networks. Overlays also have the
drawback of performing poorly with rapid flow setup and may incur too
much overhead with large quantities of small packets and therefore we do
not recommend them.
QoS is desirable for some workloads to ensure delivery. DNS has a major
impact on the load times of other services and needs to be reliable and
provide rapid responses. Configure rules in upstream devices to apply a
higher Class Selector to DNS to ensure faster delivery or a better spot
in queuing algorithms.
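For reference, a Class Selector is one of eight standard DSCP code points. The sketch below lists how those values are derived (Class Selector n is 8 × n per RFC 2474, and Expedited Forwarding is 46 per RFC 3246) and how a code point maps onto the IP header byte that devices match on. The helper names are illustrative, and the choice of CS3 for DNS is hypothetical; your existing QoS policy determines the actual class.

```python
EXPEDITED_FORWARDING = 46  # EF code point, RFC 3246


def class_selector(n: int) -> int:
    """Return the DSCP code point for Class Selector n (CS0 to CS7)."""
    if not 0 <= n <= 7:
        raise ValueError("Class Selector must be between 0 and 7")
    return 8 * n


def dscp_to_tos(dscp: int) -> int:
    """Return the TOS/Traffic Class byte for a DSCP value.

    DSCP occupies the upper six bits of the byte; the lower two bits
    are used for ECN and are left at zero here.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


if __name__ == "__main__":
    best_effort = class_selector(0)   # default, best-effort traffic
    dns_class = class_selector(3)     # hypothetical class chosen for DNS
    for name, dscp in (("CS0", best_effort), ("CS3", dns_class),
                       ("EF", EXPEDITED_FORWARDING)):
        print(f"{name}: DSCP {dscp}, TOS byte 0x{dscp_to_tos(dscp):02x}")
```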
Cloud storage
~~~~~~~~~~~~~
Another common use case for OpenStack environments is providing a
cloud-based file storage and sharing service. You might consider this a
storage-focused use case, but its network-side requirements make it a
network-focused use case.
For example, consider a cloud backup application. This workload has two
specific behaviors that impact the network. Because this workload is an
externally-facing service and an internally-replicating application, it
has both :term:`north-south<north-south traffic>` and
:term:`east-west<east-west traffic>` traffic considerations:
north-south traffic
When a user uploads and stores content, that content moves into the
OpenStack installation. When users download this content, the
content moves out from the OpenStack installation. Because this
service operates primarily as a backup, most of the traffic moves
southbound into the environment. In this situation, it benefits you
to configure a network to be asymmetrically downstream because the
traffic that enters the OpenStack installation is greater than the
traffic that leaves the installation.
east-west traffic
Likely to be fully symmetric. Because replication originates from
any node and might target multiple other nodes algorithmically, it
is less likely for this traffic to have a larger volume in any
specific direction. However, this traffic might interfere with
north-south traffic.
.. figure:: ../figures/Network_Cloud_Storage2.png
This application prioritizes the north-south traffic over east-west
traffic: the north-south traffic involves customer-facing data.
The network design, in this case, is less dependent on availability and
more dependent on being able to handle high bandwidth. As a direct
result, it is beneficial to forgo redundant links in favor of bonding
those connections. This increases available bandwidth. It is also
beneficial to configure all devices in the path, including OpenStack, to
generate and pass jumbo frames.
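The benefit of jumbo frames for bulk transfers can be estimated from per-frame overhead. The following is a rough sketch under stated assumptions: standard Ethernet framing with no VLAN tag, IPv4 and TCP headers without options, and no overlay encapsulation. The function and constant names are illustrative.

```python
# Per-frame Ethernet overhead on the wire, in bytes:
# preamble and start-of-frame delimiter (8) + header (14) + FCS (4)
# + inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12

# IPv4 header (20) + TCP header (20), both without options.
IP_TCP_HEADERS = 20 + 20


def tcp_payload_efficiency(mtu: int) -> float:
    """Fraction of on-the-wire bytes that carry TCP payload at a given MTU."""
    payload = mtu - IP_TCP_HEADERS
    on_wire = mtu + ETHERNET_OVERHEAD
    return payload / on_wire


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {tcp_payload_efficiency(mtu):.1%} payload efficiency")
```

Under these assumptions, efficiency rises from about 94.9% at an MTU of 1500 to about 99.1% at an MTU of 9000. Larger frames also reduce the number of packets each device must process for the same volume of data.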

View File

@ -0,0 +1,210 @@
.. _storage-cloud:
=============
Storage cloud
=============
Design model
~~~~~~~~~~~~
Storage-focused architecture depends on specific use cases. This section
discusses three example use cases:
* An object store with a RESTful interface
* Compute analytics with parallel file systems
* High performance database
An object store with a RESTful interface
----------------------------------------
The example below shows a REST interface without a high performance
requirement. The following diagram depicts the example architecture:
.. figure:: ../figures/Storage_Object.png
The example REST interface, presented as a traditional Object Store
running on traditional spindles, does not require a high performance
caching tier.
This example uses the following components:
Network:
* 10 GbE horizontally scalable spine leaf back-end storage and front
end network.
Storage hardware:
* 10 storage servers each with 12x4 TB disks equaling 480 TB total
space with approximately 160 TB of usable space after replicas.
Proxy:
* 3x proxies
* 2x10 GbE bonded front end
* 2x10 GbE back-end bonds
* Approximately 60 Gbps of total bandwidth to the back-end storage
cluster
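The capacity and bandwidth figures in the component list above follow from simple multiplication. The sketch below reproduces them; the function names are illustrative, and the replica count of three is an assumption that is consistent with the 160 TB usable figure.

```python
def raw_capacity_tb(servers: int, disks_per_server: int, disk_tb: float) -> float:
    """Raw capacity of the storage tier in TB."""
    return servers * disks_per_server * disk_tb


def usable_capacity_tb(raw_tb: float, replicas: int) -> float:
    """Usable capacity when every object is stored `replicas` times."""
    return raw_tb / replicas


def proxy_backend_gbps(proxies: int, links_per_bond: int, link_gbps: float) -> float:
    """Total bandwidth from the proxy tier to the back-end storage cluster."""
    return proxies * links_per_bond * link_gbps


if __name__ == "__main__":
    raw = raw_capacity_tb(servers=10, disks_per_server=12, disk_tb=4)
    print("Raw capacity (TB):", raw)
    print("Usable capacity (TB):", usable_capacity_tb(raw, replicas=3))
    print("Proxy back-end bandwidth (Gbps):",
          proxy_backend_gbps(proxies=3, links_per_bond=2, link_gbps=10))
```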
.. note::
It may be necessary to implement a third party caching layer for some
applications to achieve suitable performance.
Compute analytics with data processing service
----------------------------------------------
Analytics of large data sets are dependent on the performance of the
storage system. Clouds using storage systems such as Hadoop Distributed
File System (HDFS) have inefficiencies which can cause performance
issues.
One potential solution to this problem is the implementation of storage
systems designed for performance. Parallel file systems have previously
filled this need in the HPC space and are suitable for large scale
performance-orientated systems.
OpenStack has integration with Hadoop to manage the Hadoop cluster
within the cloud. The following diagram shows an OpenStack store with a
high performance requirement:
.. figure:: ../figures/Storage_Hadoop3.png
The hardware requirements and configuration are similar to those of the
High Performance Database example below. In this case, the architecture
uses Ceph's Swift-compatible REST interface, along with features that
allow a caching pool to be connected to accelerate the presented
pool.
High performance database with Database service
-----------------------------------------------
Databases are a common workload that benefits from high performance
storage back ends. Although enterprise storage is not a requirement,
many environments have existing storage that an OpenStack cloud can use as
back ends. You can create a storage pool to provide block devices with
OpenStack Block Storage for instances as well as object interfaces. In
this example, the database I/O requirements are high and demand storage
presented from a fast SSD pool.
A storage system presents a LUN backed by a set of SSDs using a
traditional storage array with OpenStack Block Storage integration or a
storage platform such as Ceph or Gluster.
This system can provide additional performance. For example, in the
database example below, a portion of the SSD pool can act as a block
device to the Database server. In the high performance analytics
example, the inline SSD cache layer accelerates the REST interface.
.. figure:: ../figures/Storage_Database_+_Object5.png
In this example, Ceph presents a Swift-compatible REST interface, as
well as block-level storage from a distributed storage cluster. It is
highly flexible and has features that reduce the cost of operations,
such as self healing and auto balancing. Using erasure coded pools is a
suitable way of maximizing the amount of usable space.
.. note::
There are special considerations around erasure coded pools. These
include higher computational requirements and limitations on the
operations allowed on an object. For example, erasure coded pools do
not support partial writes.
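The space advantage of erasure coding over replication comes from the ratio of data chunks to total chunks. The following sketch compares the two for the 480 TB raw pool used in this section's examples. The profile of 4 data chunks and 2 coding chunks is a hypothetical choice for illustration, not a recommendation, and the function names are not part of any Ceph API.

```python
def replicated_usable_fraction(replicas: int) -> float:
    """Usable fraction of raw capacity with N-way replication."""
    return 1 / replicas


def erasure_coded_usable_fraction(k: int, m: int) -> float:
    """Usable fraction with k data chunks and m coding chunks per object."""
    return k / (k + m)


if __name__ == "__main__":
    raw_tb = 480
    print("3 replicas (TB usable):", raw_tb * replicated_usable_fraction(3))
    # Hypothetical profile: k=4 data chunks, m=2 coding chunks.
    print("EC k=4, m=2 (TB usable):", raw_tb * erasure_coded_usable_fraction(4, 2))
```

With these example values, the erasure coded pool exposes roughly twice the usable space of the three-replica pool while still tolerating the loss of two chunks, at the cost of the extra computation and operational limits noted above.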
Using Ceph as an applicable example, a potential architecture would have
the following requirements:
Network:
* 10 GbE horizontally scalable spine leaf back-end storage and
front-end network
Storage hardware:
* 5 storage servers for the caching layer, each with 24x1 TB SSDs
* 10 storage servers each with 12x4 TB disks, which equals 480 TB total
space with approximately 160 TB of usable space after 3
replicas
REST proxy:
* 3x proxies
* 2x10 GbE bonded front end
* 2x10 GbE back-end bonds
* Approximately 60 Gbps of total bandwidth to the back-end storage
cluster
Using an SSD cache layer, you can present block devices directly to
hypervisors or instances. The REST interface can also use the SSD cache
systems as an inline cache.
Requirements
~~~~~~~~~~~~
Storage requirements
--------------------
Storage-focused OpenStack clouds must address I/O intensive workloads.
These workloads are not CPU intensive, nor are they consistently network
intensive. The network may be heavily utilized to transfer stored data, but
these workloads are not otherwise network intensive.
The selection of storage hardware determines the overall performance and
scalability of a storage-focused OpenStack design architecture. Several
factors impact the design process, including:
Latency
A key consideration in a storage-focused OpenStack cloud is latency.
Using solid-state disks (SSDs) minimizes latency and reduces CPU
delays caused by waiting for the storage, which increases performance. Use
RAID controller cards in compute hosts to improve the performance of the
underlying disk subsystem.
Scale-out solutions
Depending on the storage architecture, you can adopt a scale-out
solution, or use a highly expandable and scalable centralized storage
array. If a centralized storage array meets your requirements, then the
array vendor determines the hardware selection. It is possible to build
a storage array using commodity hardware with open source software, but
doing so requires people with the expertise to build such a system.
On the other hand, a scale-out storage solution that uses
direct-attached storage (DAS) in the servers may be an appropriate
choice. This requires configuration of the server hardware to support
the storage solution.
Considerations affecting the storage architecture (and corresponding storage
hardware) of a storage-focused OpenStack cloud include:
Connectivity
Ensure the connectivity matches the storage solution requirements. We
recommend confirming that the network characteristics minimize latency
to boost the overall performance of the design.
Latency
Determine if the use case has consistent or highly variable latency.
Throughput
Ensure that the storage solution throughput is optimized for your
application requirements.
Server hardware
Use of DAS impacts the server hardware choice and affects host
density, instance density, power density, OS-hypervisor, and
management tools.
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

View File

@ -0,0 +1,14 @@
.. _web-scale-cloud:
===============
Web scale cloud
===============
Design model
~~~~~~~~~~~~
Requirements
~~~~~~~~~~~~
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

16
tox.ini Normal file
View File

@ -0,0 +1,16 @@
[tox]
minversion = 2.0
skipsdist = True
envlist = docs
[testenv]
basepython = python3
setenv = VIRTUAL_ENV={envdir}
passenv = *_proxy *_PROXY ZUUL_*
[testenv:docs]
deps =
-r{toxinidir}/doc/requirements.txt
commands =
doc8 doc/source -e txt -e rst
sphinx-build -E -W -b html doc/source doc/build/html