[arch-design] Migrate cloud architecture examples
1. Migrate and tidy up cloud architecture examples from the current guide
2. Migrate figures
3. Add placeholder sections for new content

Change-Id: I290f555f6e0cd4200deccb4d705127d99e61c343
Partial-Bug: #1548176
Implements: blueprint archguide-mitaka-reorg
126
doc/arch-design-draft/source/arch-examples-compute.rst
Normal file
@ -0,0 +1,126 @@
|
||||
=============================
|
||||
Compute-focused cloud example
|
||||
=============================
|
||||
|
||||
The Conseil Européen pour la Recherche Nucléaire (CERN), also known as
|
||||
the European Organization for Nuclear Research, provides particle
|
||||
accelerators and other infrastructure for high-energy physics research.
|
||||
|
||||
As of 2011, CERN operated these two compute centers in Europe, with plans
|
||||
to add a third.
|
||||
|
||||
+-----------------------+------------------------+
| Data center           | Approximate capacity   |
+=======================+========================+
| Geneva, Switzerland   | - 3.5 megawatts        |
|                       |                        |
|                       | - 91000 cores          |
|                       |                        |
|                       | - 120 PB HDD           |
|                       |                        |
|                       | - 100 PB tape          |
|                       |                        |
|                       | - 310 TB memory        |
+-----------------------+------------------------+
| Budapest, Hungary     | - 2.5 megawatts        |
|                       |                        |
|                       | - 20000 cores          |
|                       |                        |
|                       | - 6 PB HDD             |
+-----------------------+------------------------+
|
||||
|
||||
To support a growing number of compute-heavy users of experiments
|
||||
related to the Large Hadron Collider (LHC), CERN ultimately elected to
|
||||
deploy an OpenStack cloud using Scientific Linux and RDO. This effort
|
||||
aimed to simplify the management of the center's compute resources with
|
||||
a view to doubling compute capacity through the addition of a data
|
||||
center in 2013 while maintaining the same levels of compute staff.
|
||||
|
||||
The CERN solution uses :term:`cells <cell>` for segregation of compute
|
||||
resources and for transparently scaling between different data centers.
|
||||
This decision meant trading off support for security groups and live
|
||||
migration. In addition, they must manually replicate some details, like
|
||||
flavors, across cells. In spite of these drawbacks, cells provide the
|
||||
required scale while exposing a single public API endpoint to users.
|
||||
|
||||
CERN created a compute cell for each of the two original data centers
|
||||
and created a third when it added a new data center in 2013. Each cell
|
||||
contains three availability zones to further segregate compute resources
|
||||
and at least three RabbitMQ message brokers configured for clustering
|
||||
with mirrored queues for high availability.
|
||||
|
||||
The API cell, which resides behind an HAProxy load balancer, is in the
|
||||
data center in Switzerland and directs API calls to compute cells using
|
||||
a customized variation of the cell scheduler. The customizations allow
|
||||
certain workloads to route to a specific data center or all data
|
||||
centers, with cell RAM availability determining cell selection in the
|
||||
latter case.
|
||||
|
||||
.. figure:: figures/Generic_CERN_Example.png
|
||||
|
||||
There is also some customization of the filter scheduler that handles
|
||||
placement within the cells:
|
||||
|
||||
ImagePropertiesFilter
|
||||
Provides special handling depending on the guest operating system in
|
||||
use (Linux-based or Windows-based).
|
||||
|
||||
ProjectsToAggregateFilter
|
||||
Provides special handling depending on which project the instance is
|
||||
associated with.
|
||||
|
||||
default_schedule_zones
|
||||
Allows the selection of multiple default availability zones, rather
|
||||
than a single default.
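
The following is a minimal, self-contained sketch of the kind of
per-project placement logic such a filter implements. It is not CERN's
code: the class, the project-to-aggregate mapping, and the
dictionary-based host and request objects are illustrative assumptions.
A real filter subclasses nova's ``BaseHostFilter`` and receives nova's
host-state and request-spec objects, whose exact shape varies by
release.

.. code-block:: python

   # Hypothetical mapping, analogous to aggregate metadata that ties
   # projects to groups of hosts.
   PROJECT_AGGREGATES = {
       "atlas": {"cell01-aggregate"},
       "cms": {"cell02-aggregate"},
   }


   class ProjectsToAggregateFilter(object):
       """Only pass hosts whose aggregate is associated with the project."""

       def host_passes(self, host_state, request_spec):
           allowed = PROJECT_AGGREGATES.get(request_spec["project_id"])
           if not allowed:
               # Projects without a mapping may land anywhere.
               return True
           return bool(allowed & set(host_state["aggregates"]))


   # Example: a host in cell01-aggregate passes for the "atlas" project.
   host = {"aggregates": ["cell01-aggregate"]}
   spec = {"project_id": "atlas"}
   print(ProjectsToAggregateFilter().host_passes(host, spec))  # True

Custom filters of this shape are typically enabled by adding them to the
scheduler's filter list in ``nova.conf``.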
|
||||
|
||||
A central database team manages the MySQL database server in each cell
|
||||
in an active/passive configuration with a NetApp storage back end.
|
||||
Backups run every 6 hours.
|
||||
|
||||
Network architecture
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
To integrate with existing networking infrastructure, CERN made
|
||||
customizations to legacy networking (nova-network). This was in the form
|
||||
of a driver to integrate with CERN's existing database for tracking MAC
|
||||
and IP address assignments.
|
||||
|
||||
The driver facilitates selecting a MAC address and IP for new
instances: it considers the compute node where the scheduler placed the
instance and selects a MAC address and IP from the pre-registered list
associated with that node in the database. The database then updates to
reflect the address assignment to that instance.
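
A hedged sketch of that lookup is shown below. The table layout, names,
and the SQLite stand-in for CERN's network database are assumptions for
illustration; the real driver performs the equivalent query against the
existing CERN database from within legacy networking.

.. code-block:: python

   import sqlite3

   conn = sqlite3.connect(":memory:")
   conn.execute(
       "CREATE TABLE registered_addresses "
       "(compute_node TEXT, mac TEXT, ip TEXT, instance_uuid TEXT)"
   )
   conn.execute(
       "INSERT INTO registered_addresses VALUES "
       "('compute-042', '02:16:3e:00:00:2a', '188.184.10.42', NULL)"
   )


   def allocate_address(compute_node, instance_uuid):
       """Pick a free pre-registered MAC/IP pair for the scheduled node."""
       row = conn.execute(
           "SELECT rowid, mac, ip FROM registered_addresses "
           "WHERE compute_node = ? AND instance_uuid IS NULL LIMIT 1",
           (compute_node,),
       ).fetchone()
       if row is None:
           raise LookupError("no free address registered for %s" % compute_node)
       rowid, mac, ip = row
       # Record the assignment so the central database stays authoritative.
       conn.execute(
           "UPDATE registered_addresses SET instance_uuid = ? WHERE rowid = ?",
           (instance_uuid, rowid),
       )
       return mac, ip


   print(allocate_address("compute-042",
                          "c0ffee00-0000-4000-8000-000000000001"))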
|
||||
|
||||
Storage architecture
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
CERN deploys the OpenStack Image service in the API cell and configures
|
||||
it to expose version 1 (V1) of the API. This also requires the image
|
||||
registry. The storage back end in use is a 3 PB Ceph cluster.
|
||||
|
||||
CERN maintains a small set of Scientific Linux 5 and 6 images onto which
|
||||
orchestration tools can place applications. Puppet manages instance
|
||||
configuration and customization.
|
||||
|
||||
Monitoring
|
||||
~~~~~~~~~~
|
||||
|
||||
CERN does not require direct billing, but uses the Telemetry service to
|
||||
perform metering for the purposes of adjusting project quotas. CERN uses
|
||||
a sharded, replicated MongoDB back end. To spread API load, CERN
|
||||
deploys instances of the nova-api service within the child cells for
|
||||
Telemetry to query against. This also requires the configuration of
|
||||
supporting services such as keystone, glance-api, and glance-registry in
|
||||
the child cells.
|
||||
|
||||
.. figure:: figures/Generic_CERN_Architecture.png
|
||||
|
||||
Additional monitoring tools in use include
`Flume <http://flume.apache.org/>`_,
`Elasticsearch <http://www.elasticsearch.org/>`_,
`Kibana <http://www.elasticsearch.org/overview/kibana/>`_, and the
CERN-developed `Lemon <http://lemon.web.cern.ch/lemon/index.shtml>`_
project.
|
85
doc/arch-design-draft/source/arch-examples-general.rst
Normal file
@ -0,0 +1,85 @@
|
||||
=====================
|
||||
General cloud example
|
||||
=====================
|
||||
|
||||
An online classified advertising company wants to run web applications
|
||||
consisting of Tomcat, Nginx and MariaDB in a private cloud. To be able
|
||||
to meet policy requirements, the cloud infrastructure will run in their
|
||||
own data center. The company has predictable load requirements, but
|
||||
requires scaling to cope with nightly increases in demand. Their current
|
||||
environment does not have the flexibility to align with their goal of
|
||||
running an open source API environment. The current environment consists
|
||||
of the following:
|
||||
|
||||
* Between 120 and 140 installations of Nginx and Tomcat, each with 2
|
||||
vCPUs and 4 GB of RAM
|
||||
|
||||
* A three-node MariaDB and Galera cluster, each with 4 vCPUs and 8 GB
|
||||
RAM
|
||||
|
||||
The company runs hardware load balancers and multiple web applications
|
||||
serving their websites, and orchestrates environments using combinations
|
||||
of scripts and Puppet. The website generates large amounts of log data
|
||||
daily that requires archiving.
|
||||
|
||||
The solution would consist of the following OpenStack components:
|
||||
|
||||
* A firewall, switches, and load balancers on the public-facing network
|
||||
connections.
|
||||
|
||||
* OpenStack Controller services running Image, Identity, and Networking,
|
||||
combined with support services such as MariaDB and RabbitMQ,
|
||||
configured for high availability on at least three controller nodes.
|
||||
|
||||
* OpenStack Compute nodes running the KVM hypervisor.
|
||||
|
||||
* OpenStack Block Storage for use by compute instances that require
|
||||
persistent storage (such as databases for dynamic sites).
|
||||
|
||||
* OpenStack Object Storage for serving static objects (such as images).
|
||||
|
||||
.. figure:: figures/General_Architecture3.png
|
||||
|
||||
Running up to 140 web instances and the small number of MariaDB
|
||||
instances requires 292 vCPUs available, as well as 584 GB RAM. On a
|
||||
typical 1U server using dual-socket hex-core Intel CPUs with
|
||||
Hyperthreading, and assuming a 2:1 CPU overcommit ratio, this would
|
||||
require 8 OpenStack Compute nodes.
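
As a quick sanity check, the arithmetic behind these figures can be
reproduced as follows. The per-node capacity assumptions (dual-socket
hex-core with Hyperthreading, 2:1 CPU overcommit) are taken from the
paragraph above, and the result is an estimate rather than a sizing
rule.

.. code-block:: python

   import math

   web_instances, web_vcpus, web_ram_gb = 140, 2, 4
   db_instances, db_vcpus, db_ram_gb = 3, 4, 8

   total_vcpus = web_instances * web_vcpus + db_instances * db_vcpus      # 292
   total_ram_gb = web_instances * web_ram_gb + db_instances * db_ram_gb   # 584

   threads_per_node = 2 * 6 * 2           # sockets x cores x Hyperthreading
   vcpus_per_node = threads_per_node * 2  # 2:1 CPU overcommit -> 48

   nodes_for_cpu = math.ceil(total_vcpus / vcpus_per_node)
   print(total_vcpus, total_ram_gb, nodes_for_cpu)  # 292 584 7

The raw CPU arithmetic yields seven nodes; rounding up to eight, as the
example does, presumably leaves headroom for RAM and the failure of a
single node.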
|
||||
|
||||
The web application instances run from local storage on each of the
|
||||
OpenStack Compute nodes. The web application instances are stateless,
|
||||
meaning that any of the instances can fail and the application will
|
||||
continue to function.
|
||||
|
||||
MariaDB server instances store their data on shared enterprise storage,
|
||||
such as NetApp or SolidFire devices. If a MariaDB instance fails,
|
||||
storage would be expected to be re-attached to another instance and
|
||||
rejoined to the Galera cluster.
|
||||
|
||||
Logs from the web application servers are shipped to OpenStack Object
|
||||
Storage for processing and archiving.
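
A hedged sketch of such a log-shipping step using python-swiftclient is
shown below; the credentials, endpoint, container, and object names are
placeholders.

.. code-block:: python

   from swiftclient.client import Connection

   conn = Connection(
       authurl="https://keystone.example.com:5000/v3",
       user="logshipper",
       key="PASSWORD",
       auth_version="3",
       os_options={"project_name": "web",
                   "user_domain_name": "Default",
                   "project_domain_name": "Default"},
   )

   conn.put_container("weblogs")           # idempotent; creates if missing
   with open("/var/log/nginx/access.log.1", "rb") as f:
       conn.put_object(
           "weblogs",
           "web01/2016-03-01/access.log",  # object name encodes host and date
           contents=f,
           content_type="text/plain",
       )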
|
||||
|
||||
Additional capabilities can be realized by moving static web content to
|
||||
be served from OpenStack Object Storage containers, and backing the
|
||||
OpenStack Image service with OpenStack Object Storage.
|
||||
|
||||
.. note::
|
||||
|
||||
Increasing OpenStack Object Storage means network bandwidth needs to
|
||||
be taken into consideration. Running OpenStack Object Storage with
|
||||
network connections offering 10 GbE or better connectivity is
|
||||
advised.
|
||||
|
||||
Leveraging the Orchestration and Telemetry services is also an
|
||||
option when providing auto-scaling, orchestrated web application
|
||||
environments. Defining the web applications in a
|
||||
:term:`Heat Orchestration Template (HOT)`
|
||||
negates the reliance on the current scripted Puppet
|
||||
solution.
|
||||
|
||||
OpenStack Networking can be used to control hardware load balancers
|
||||
through the use of plug-ins and the Networking API. This allows users to
|
||||
control hardware load balancer pools and instances as members in these
|
||||
pools, but their use in production environments must be carefully
|
||||
weighed against current stability.
|
||||
|
154
doc/arch-design-draft/source/arch-examples-hybrid.rst
Normal file
@ -0,0 +1,154 @@
|
||||
=====================
|
||||
Hybrid cloud examples
|
||||
=====================
|
||||
|
||||
Hybrid cloud environments are designed for these use cases:
|
||||
|
||||
* Bursting workloads from private to public OpenStack clouds
|
||||
* Bursting workloads from private to public non-OpenStack clouds
|
||||
* High availability across clouds (for technical diversity)
|
||||
|
||||
This chapter provides examples of environments that address
|
||||
each of these use cases.
|
||||
|
||||
Bursting to a public OpenStack cloud
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Company A's data center is running low on capacity.
|
||||
It is not possible to expand the data center in the foreseeable future.
|
||||
In order to accommodate the continuously growing need for
|
||||
development resources in the organization,
|
||||
Company A decides to use resources in the public cloud.
|
||||
|
||||
Company A has an established data center with a substantial amount
|
||||
of hardware. Migrating the workloads to a public cloud is not feasible.
|
||||
|
||||
The company has an internal cloud management platform that directs
|
||||
requests to the appropriate cloud, depending on the local capacity.
|
||||
This is a custom in-house application written for this specific purpose.
|
||||
|
||||
This solution is depicted in the figure below:
|
||||
|
||||
.. figure:: figures/Multi-Cloud_Priv-Pub3.png
|
||||
:width: 100%
|
||||
|
||||
This example shows two clouds with a Cloud Management
|
||||
Platform (CMP) connecting them. This guide does not
|
||||
discuss a specific CMP, but describes how the Orchestration and
|
||||
Telemetry services handle, manage, and control workloads.
|
||||
|
||||
The private OpenStack cloud has at least one controller and at least
|
||||
one compute node. It includes metering using the Telemetry service.
|
||||
The Telemetry service captures the load increase and the CMP
|
||||
processes the information. If there is available capacity,
|
||||
the CMP uses the OpenStack API to call the Orchestration service.
|
||||
This creates instances on the private cloud in response to user requests.
|
||||
When capacity is not available on the private cloud, the CMP issues
|
||||
a request to the Orchestration service API of the public cloud.
|
||||
This creates the instance on the public cloud.
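
The following is a simplified sketch of that decision, not the
company's actual CMP code: the ``telemetry``, ``private_heat``, and
``public_heat`` objects stand in for whatever clients the in-house
platform uses, and the threshold is an arbitrary example.

.. code-block:: python

   CPU_THRESHOLD = 0.80  # burst when average utilization exceeds 80%


   def place_stack(telemetry, private_heat, public_heat, template):
       """Create the stack privately if capacity allows, else burst out."""
       utilization = telemetry.average_cpu_utilization()  # fed by Telemetry
       if utilization < CPU_THRESHOLD:
           # Capacity available: drive the private cloud's Orchestration API.
           return private_heat.stacks.create(stack_name="dev-env",
                                             template=template)
       # Private cloud is full: issue the same request to the public cloud.
       return public_heat.stacks.create(stack_name="dev-env",
                                        template=template)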
|
||||
|
||||
In this example, Company A does not direct all of its deployments to an
|
||||
external public cloud due to concerns regarding resource control,
|
||||
security, and increased operational expense.
|
||||
|
||||
Bursting to a public non-OpenStack cloud
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The second example examines bursting workloads from the private cloud
|
||||
into a non-OpenStack public cloud using Amazon Web Services (AWS)
|
||||
to take advantage of additional capacity and to scale applications.
|
||||
|
||||
The following diagram demonstrates an OpenStack-to-AWS hybrid cloud:
|
||||
|
||||
.. figure:: figures/Multi-Cloud_Priv-AWS4.png
|
||||
:width: 100%
|
||||
|
||||
Company B states that its developers are already using AWS
|
||||
and do not want to change to a different provider.
|
||||
|
||||
If the CMP is capable of connecting to an external cloud
|
||||
provider with an appropriate API, the workflow process remains
|
||||
the same as the previous scenario.
|
||||
The actions the CMP takes, such as monitoring loads and
|
||||
creating new instances, stay the same.
|
||||
However, the CMP performs actions in the public cloud
|
||||
using applicable API calls.
|
||||
|
||||
If the public cloud is AWS, the CMP would use the
|
||||
EC2 API to create a new instance and assign an Elastic IP.
|
||||
It can then add that IP to HAProxy in the private cloud.
|
||||
The CMP can also reference AWS-specific
|
||||
tools such as CloudWatch and CloudFormation.
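
A hedged sketch of the AWS side of that workflow, using the EC2 API
through boto3, might look like the following; the AMI ID, instance
type, and region are placeholders.

.. code-block:: python

   import boto3

   ec2 = boto3.client("ec2", region_name="eu-west-1")

   run = ec2.run_instances(
       ImageId="ami-0123456789abcdef0",   # placeholder AMI
       InstanceType="t2.medium",
       MinCount=1,
       MaxCount=1,
   )
   instance_id = run["Instances"][0]["InstanceId"]

   ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

   eip = ec2.allocate_address(Domain="vpc")
   ec2.associate_address(InstanceId=instance_id,
                         AllocationId=eip["AllocationId"])

   # The CMP would now add eip["PublicIp"] to the HAProxy back-end pool
   # in the private cloud.
   print(eip["PublicIp"])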
|
||||
|
||||
Several open source toolkits for building CMPs are
|
||||
available and can handle this kind of translation.
|
||||
Examples include ManageIQ, jClouds, and JumpGate.
|
||||
|
||||
High availability and disaster recovery
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Company C requires its local data center to be able to
|
||||
recover from failure. Some of the workloads currently in
|
||||
use are running on their private OpenStack cloud.
|
||||
Protecting the data involves Block Storage, Object Storage,
|
||||
and a database. The architecture supports the failure of
|
||||
large components of the system while ensuring that the
|
||||
system continues to deliver services.
|
||||
While the services remain available to users, the failed
|
||||
components are restored in the background based on standard
|
||||
best practice data replication policies.
|
||||
To achieve these objectives, Company C replicates data to
|
||||
a second cloud in a geographically distant location.
|
||||
The following diagram describes this system:
|
||||
|
||||
.. figure:: figures/Multi-Cloud_failover2.png
|
||||
:width: 100%
|
||||
|
||||
This example includes two private OpenStack clouds connected with a CMP.
|
||||
The source cloud, OpenStack Cloud 1, includes a controller and
|
||||
at least one instance running MySQL. It also includes at least
|
||||
one Block Storage volume and one Object Storage volume.
|
||||
This means that data is available to the users at all times.
|
||||
The details of the method for protecting each of these sources
|
||||
of data differ.
|
||||
|
||||
Object Storage relies on the replication capabilities of
|
||||
the Object Storage provider.
|
||||
Company C enables OpenStack Object Storage so that it creates
|
||||
geographically separated replicas that take advantage of this feature.
|
||||
The company configures storage so that at least one replica
|
||||
exists in each cloud. In order to make this work, the company
|
||||
configures a single array spanning both clouds with OpenStack Identity.
|
||||
Using Federated Identity, the array talks to both clouds, communicating
|
||||
with OpenStack Object Storage through the Swift proxy.
|
||||
|
||||
For Block Storage, the replication is a little more difficult,
|
||||
and involves tools outside of OpenStack itself.
|
||||
The OpenStack Block Storage volume is not set as the drive itself
|
||||
but as a logical object that points to a physical back end.
|
||||
The company configures Block Storage disaster recovery to use
synchronous backup for the highest level of data protection, although
asynchronous backup could have been chosen as a less latency-sensitive
alternative.
|
||||
For asynchronous backup, the Block Storage API makes it possible
|
||||
to export the data and also the metadata of a particular volume,
|
||||
so that it can be moved and replicated elsewhere.
|
||||
More information can be found here:
|
||||
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-volume-metadata-support.
|
||||
|
||||
The synchronous backups create an identical volume in both
|
||||
clouds and choose the appropriate flavor so that each cloud
|
||||
has an identical back end. This is done by creating volumes
|
||||
through the CMP. After this is configured, a solution
|
||||
involving DRBD synchronizes the physical drives.
|
||||
|
||||
The database component is backed up using synchronous backups.
|
||||
MySQL does not support geographically diverse replication,
|
||||
so disaster recovery is provided by replicating the file itself.
|
||||
As it is not possible to use Object Storage as the back end of
|
||||
a database like MySQL, Swift replication is not an option.
|
||||
Company C decides not to store the data on another geo-tiered
|
||||
storage system, such as Ceph, as Block Storage.
|
||||
This would have given another layer of protection.
|
||||
Another option would have been to store the database on an OpenStack
|
||||
Block Storage volume and back it up like any other Block Storage volume.
|
192
doc/arch-design-draft/source/arch-examples-multi-site.rst
Normal file
@ -0,0 +1,192 @@
|
||||
=========================
|
||||
Multi-site cloud examples
|
||||
=========================
|
||||
|
||||
There are multiple ways to build a multi-site OpenStack installation,
|
||||
based on the needs of the intended workloads. Below are example
|
||||
architectures based on different requirements. These examples are meant
|
||||
as a reference, and not a hard and fast rule for deployments. Use the
|
||||
previous sections of this chapter to assist in selecting specific
|
||||
components and implementations based on specific needs.
|
||||
|
||||
A large content provider needs to deliver content to customers that are
|
||||
geographically dispersed. The workload is very sensitive to latency and
|
||||
needs a rapid response to end-users. After reviewing the user, technical
|
||||
and operational considerations, it is determined beneficial to build a
|
||||
number of regions local to the customer's edge. Rather than build a few
|
||||
large, centralized data centers, the intent of the architecture is to
|
||||
provide a pair of small data centers in locations that are closer to the
|
||||
customer. In this use case, spreading applications out allows for
|
||||
different horizontal scaling than a traditional compute workload scale.
|
||||
The intent is to scale by creating more copies of the application in
|
||||
closer proximity to the users that need it most, in order to ensure
|
||||
faster response time to user requests. This provider deploys two
|
||||
data centers at each of the four chosen regions. The implications of this
|
||||
design are based around the method of placing copies of resources in
|
||||
each of the remote regions. Swift objects, Glance images, and block
|
||||
storage need to be manually replicated into each region. This may be
|
||||
beneficial for some systems, such as the case of content service, where
|
||||
only some of the content needs to exist in some but not all regions. A
|
||||
centralized Keystone deployment is recommended to ensure that
authentication and access to the API endpoints are easily manageable.
|
||||
|
||||
It is recommended that you install an automated DNS system such as
|
||||
Designate. Application administrators need a way to manage the mapping
|
||||
of which application copy exists in each region and how to reach it,
|
||||
unless an external Dynamic DNS system is available. Designate assists by
|
||||
making the process automatic and by populating the records in each
|
||||
region's zone.
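
As an illustration, the records for a newly launched application copy
could be populated through the Designate API roughly as follows. This
is a sketch that assumes a recent openstacksdk with DNS (Designate)
support and a ``clouds.yaml`` entry per region; all names and addresses
are placeholders.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="region-one")   # clouds.yaml entry

   zone = conn.dns.find_zone("region-one.app.example.org.")
   conn.dns.create_recordset(
       zone,
       name="web.region-one.app.example.org.",
       type="A",
       records=["203.0.113.10"],   # IP of the newly deployed copy
       ttl=300,
   )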
|
||||
|
||||
Telemetry for each region is also deployed, as each region may grow
|
||||
differently or be used at a different rate. Ceilometer collects each
|
||||
region's meters from each of the controllers and reports them back to a
|
||||
central location. This is useful both to the end user and the
|
||||
administrator of the OpenStack environment. The end user will find this
|
||||
method useful, as it makes it possible to determine if certain locations
|
||||
are experiencing higher load than others, and take appropriate action.
|
||||
Administrators also benefit by possibly being able to forecast growth
|
||||
per region, rather than expanding the capacity of all regions
|
||||
simultaneously, therefore maximizing the cost-effectiveness of the
|
||||
multi-site design.
|
||||
|
||||
One of the key decisions of running this infrastructure is whether or
|
||||
not to provide a redundancy model. Two types of redundancy and high
|
||||
availability models in this configuration can be implemented. The first
|
||||
type is the availability of central OpenStack components. Keystone can
|
||||
be made highly available in three central data centers that host the
|
||||
centralized OpenStack components. This prevents the loss of any one of the
|
||||
regions from causing a service outage. It also has the added benefit of
|
||||
being able to run a central storage repository as a primary cache for
|
||||
distributing content to each of the regions.
|
||||
|
||||
The second redundancy type is the edge data center itself. A second data
|
||||
center in each of the edge regional locations houses a second region near
|
||||
the first region. This ensures that the application does not suffer
|
||||
degraded performance in terms of latency and availability.
|
||||
|
||||
:ref:`ms-customer-edge` depicts the solution designed to have both a
|
||||
centralized set of core data centers for OpenStack services and paired edge
|
||||
data centers:
|
||||
|
||||
.. _ms-customer-edge:
|
||||
|
||||
.. figure:: figures/Multi-Site_Customer_Edge.png
|
||||
|
||||
**Multi-site architecture example**
|
||||
|
||||
Geo-redundant load balancing
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
A large-scale web application has been designed with cloud principles in
|
||||
mind. The application is designed to provide service to an application store
|
||||
on a 24/7 basis. The company has a typical two-tier architecture with a
|
||||
web front-end servicing the customer requests, and a NoSQL database back
|
||||
end storing the information.
|
||||
|
||||
Of late, there have been several outages among a number of major public
|
||||
cloud providers due to applications running out of a single geographical
|
||||
location. The design therefore should mitigate the chance of a single
|
||||
site causing an outage for their business.
|
||||
|
||||
The solution would consist of the following OpenStack components:
|
||||
|
||||
* A firewall, switches, and load balancers on the public-facing network
|
||||
connections.
|
||||
|
||||
* OpenStack Controller services (Networking, dashboard, Block
|
||||
Storage, and Compute) running locally in each of the three regions.
|
||||
Identity service, Orchestration service, Telemetry service, Image
|
||||
service and Object Storage service can be installed centrally, with
|
||||
nodes in each of the regions providing a redundant OpenStack
|
||||
control plane throughout the globe.
|
||||
|
||||
* OpenStack Compute nodes running the KVM hypervisor.
|
||||
|
||||
* OpenStack Object Storage for serving static objects such as images
|
||||
can be used to ensure that all images are standardized across all the
|
||||
regions, and replicated on a regular basis.
|
||||
|
||||
* A distributed DNS service available to all regions that allows for
|
||||
dynamic update of DNS records of deployed instances.
|
||||
|
||||
* A geo-redundant load balancing service can be used to service the
|
||||
requests from the customers based on their origin.
|
||||
|
||||
An autoscaling Heat template can be used to deploy the application in
|
||||
the three regions, as shown in the sketch after this list. This template includes:
|
||||
|
||||
* Web Servers, running Apache.
|
||||
|
||||
* Appropriate ``user_data`` to populate the central DNS servers upon
|
||||
instance launch.
|
||||
|
||||
* Appropriate Telemetry alarms that maintain state of the application
|
||||
and allow for handling of region or instance failure.
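
A hedged sketch of driving that template into all three regions through
the Orchestration API is shown below; the ``clouds.yaml`` entries,
template file, and parameters are placeholders rather than a definitive
implementation.

.. code-block:: python

   import openstack
   import yaml

   with open("web-asg.yaml") as f:
       template = yaml.safe_load(f)

   for cloud in ("site-east", "site-west", "site-central"):
       conn = openstack.connect(cloud=cloud)
       conn.orchestration.create_stack(
           name="web-tier",
           template=template,
           parameters={"dns_zone": "app.example.org."},
       )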
|
||||
|
||||
Another autoscaling Heat template can be used to deploy a distributed
|
||||
MongoDB shard over the three locations, with the option of storing
|
||||
required data on a globally available swift container. According to the
|
||||
usage and load on the database server, additional shards can be
|
||||
provisioned according to the thresholds defined in Telemetry.
|
||||
|
||||
Two data centers would have been sufficient had the requirements been
|
||||
met. But three regions are selected here to avoid abnormal load on a
|
||||
single region in the event of a failure.
|
||||
|
||||
Orchestration is used because of the built-in functionality of
|
||||
autoscaling and auto healing in the event of increased load. Additional
|
||||
configuration management tools, such as Puppet or Chef could also have
|
||||
been used in this scenario, but were not chosen since Orchestration had
|
||||
the appropriate built-in hooks into the OpenStack cloud, whereas the
|
||||
other tools were external and not native to OpenStack. In addition,
|
||||
external tools were not needed since this deployment scenario was
|
||||
straightforward.
|
||||
|
||||
OpenStack Object Storage is used here to serve as a back end for the
|
||||
Image service since it is the most suitable solution for a globally
|
||||
distributed storage solution with its own replication mechanism. Home
|
||||
grown solutions could also have been used including the handling of
|
||||
replication, but were not chosen, because Object Storage is already an
|
||||
integral part of the infrastructure and a proven solution.
|
||||
|
||||
An external load balancing service was used and not the LBaaS in
|
||||
OpenStack because the solution in OpenStack is not redundant and does
|
||||
not have any awareness of geo location.
|
||||
|
||||
.. _ms-geo-redundant:
|
||||
|
||||
.. figure:: figures/Multi-site_Geo_Redundant_LB.png
|
||||
|
||||
**Multi-site geo-redundant architecture**
|
||||
|
||||
Location-local service
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
A common use for multi-site OpenStack deployment is creating a Content
|
||||
Delivery Network. An application that uses a location-local architecture
|
||||
requires low network latency and proximity to the user to provide an
|
||||
optimal user experience and reduce the cost of bandwidth and transit.
|
||||
The content resides on sites closer to the customer, instead of a
|
||||
centralized content store that requires utilizing higher cost
|
||||
cross-country links.
|
||||
|
||||
This architecture includes a geo-location component that places user
|
||||
requests to the closest possible node. In this scenario, 100% redundancy
|
||||
of content across every site is a goal rather than a requirement, with
|
||||
the intent to maximize the amount of content available within a minimum
|
||||
number of network hops for end users. Despite these differences, the
|
||||
storage replication configuration has significant overlap with that of a
|
||||
geo-redundant load balancing use case.
|
||||
|
||||
In :ref:`ms-shared-keystone`, the location-aware application utilizing
this multi-site OpenStack installation launches web server or
content-serving instances on the compute cluster in each site. Requests from clients
|
||||
are first sent to a global services load balancer that determines the location
|
||||
of the client, then routes the request to the closest OpenStack site where the
|
||||
application completes the request.
|
||||
|
||||
.. _ms-shared-keystone:
|
||||
|
||||
.. figure:: figures/Multi-Site_shared_keystone1.png
|
||||
|
||||
**Multi-site shared keystone architecture**
|
166
doc/arch-design-draft/source/arch-examples-network.rst
Normal file
@ -0,0 +1,166 @@
|
||||
==============================
|
||||
Network-focused cloud examples
|
||||
==============================
|
||||
|
||||
An organization designs a large-scale web application with cloud
|
||||
principles in mind. The application scales horizontally in a bursting
|
||||
fashion and generates a high instance count. The application requires an
|
||||
SSL connection to secure data and must not lose connection state to
|
||||
individual servers.
|
||||
|
||||
The figure below depicts an example design for this workload. In this
|
||||
example, a hardware load balancer provides SSL offload functionality and
|
||||
connects to tenant networks in order to reduce address consumption. This
|
||||
load balancer links to the routing architecture as it services the VIP
|
||||
for the application. The router and load balancer use the GRE tunnel ID
|
||||
of the application's tenant network and an IP address within the tenant
|
||||
subnet but outside of the address pool. This is to ensure that the load
|
||||
balancer can communicate with the application's HTTP servers without
|
||||
requiring the consumption of a public IP address.
|
||||
|
||||
Because sessions persist until closed, the routing and switching
|
||||
architecture provides high availability. Switches mesh to each
|
||||
hypervisor and each other, and also provide an MLAG implementation to
|
||||
ensure that layer-2 connectivity does not fail. Routers use VRRP and
|
||||
fully mesh with switches to ensure layer-3 connectivity. Since GRE
|
||||
provides an overlay network, Networking is present and uses the Open
|
||||
vSwitch agent in GRE tunnel mode. This ensures all devices can reach all
|
||||
other devices and that you can create tenant networks for private
|
||||
addressing links to the load balancer.
|
||||
|
||||
.. figure:: figures/Network_Web_Services1.png
|
||||
|
||||
A web service architecture has many options and optional components. Due
|
||||
to this, it can fit into a large number of other OpenStack designs. A
|
||||
few key components, however, need to be in place to handle the nature of
|
||||
most web-scale workloads. You require the following components:
|
||||
|
||||
* OpenStack Controller services (Image, Identity, Networking and
|
||||
supporting services such as MariaDB and RabbitMQ)
|
||||
|
||||
* OpenStack Compute running KVM hypervisor
|
||||
|
||||
* OpenStack Object Storage
|
||||
|
||||
* Orchestration service
|
||||
|
||||
* Telemetry service
|
||||
|
||||
Beyond the normal Identity, Compute, Image service, and Object Storage
|
||||
components, we recommend the Orchestration service component to handle
|
||||
the proper scaling of workloads to adjust to demand. Due to the
|
||||
requirement for auto-scaling, the design includes the Telemetry service.
|
||||
Web services tend to be bursty in load, have very defined peak and
|
||||
valley usage patterns and, as a result, benefit from automatic scaling
|
||||
of instances based upon traffic. At a network level, a split network
|
||||
configuration works well with databases residing on private tenant
|
||||
networks since these do not emit a large quantity of broadcast traffic
|
||||
and may need to interconnect to some databases for content.
|
||||
|
||||
Load balancing
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
Load balancing spreads requests across multiple instances. This workload
|
||||
scales well horizontally across large numbers of instances. This enables
|
||||
instances to run without publicly routed IP addresses and instead to
|
||||
rely on the load balancer to provide a globally reachable service. Many
|
||||
of these services do not require direct server return. This aids in
|
||||
address planning and utilization at scale since only the virtual IP
|
||||
(VIP) must be public.
|
||||
|
||||
Overlay networks
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
The overlay functionality design includes OpenStack Networking in Open
|
||||
vSwitch GRE tunnel mode. In this case, the layer-3 external routers pair
|
||||
with VRRP, and switches pair with an implementation of MLAG to ensure
|
||||
that you do not lose connectivity with the upstream routing
|
||||
infrastructure.
|
||||
|
||||
Performance tuning
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Network level tuning for this workload is minimal. Quality-of-Service
|
||||
(QoS) applies to these workloads for a middle ground Class Selector
|
||||
depending on existing policies. It is higher than a best effort queue
|
||||
but lower than an Expedited Forwarding or Assured Forwarding queue.
|
||||
Since this type of application generates larger packets with
|
||||
longer-lived connections, you can optimize bandwidth utilization for
|
||||
long duration TCP. Normal bandwidth planning applies here with regards
|
||||
to benchmarking a session's usage multiplied by the expected number of
|
||||
concurrent sessions with overhead.
|
||||
|
||||
Network functions
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
Network functions is a broad category but encompasses workloads that
|
||||
support the rest of a system's network. These workloads tend to consist
|
||||
of large amounts of small packets that are very short lived, such as DNS
|
||||
queries or SNMP traps. These messages need to arrive quickly and do not
|
||||
deal well with packet loss, as there can be a very large volume of them. There
|
||||
are a few extra considerations to take into account for this type of
|
||||
workload and this can change a configuration all the way to the
|
||||
hypervisor level. For an application that generates 10 TCP sessions per
|
||||
user with an average bandwidth of 512 kilobytes per second per flow and
|
||||
expected user count of ten thousand concurrent users, the expected
|
||||
bandwidth plan is approximately 4.88 gigabits per second.
|
||||
|
||||
The supporting network for this type of configuration needs to have a
|
||||
low latency and evenly distributed availability. This workload benefits
|
||||
from having services local to the consumers of the service. Use a
|
||||
multi-site approach as well as deploying many copies of the application
|
||||
to handle load as close as possible to consumers. Since these
|
||||
applications function independently, they do not warrant running
|
||||
overlays to interconnect tenant networks. Overlays also have the
|
||||
drawback of performing poorly with rapid flow setup and may incur too
|
||||
much overhead with large quantities of small packets and therefore we do
|
||||
not recommend them.
|
||||
|
||||
QoS is desirable for some workloads to ensure delivery. DNS has a major
|
||||
impact on the load times of other services and needs to be reliable and
|
||||
provide rapid responses. Configure rules in upstream devices to apply a
|
||||
higher Class Selector to DNS to ensure faster delivery or a better spot
|
||||
in queuing algorithms.
|
||||
|
||||
Cloud storage
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
Another common use case for OpenStack environments is providing a
|
||||
cloud-based file storage and sharing service. You might consider this a
|
||||
storage-focused use case, but its network-side requirements make it a
|
||||
network-focused use case.
|
||||
|
||||
For example, consider a cloud backup application. This workload has two
|
||||
specific behaviors that impact the network. Because this workload is an
|
||||
externally-facing service and an internally-replicating application, it
|
||||
has both :term:`north-south<north-south traffic>` and
|
||||
:term:`east-west<east-west traffic>` traffic considerations:
|
||||
|
||||
north-south traffic
|
||||
When a user uploads and stores content, that content moves into the
|
||||
OpenStack installation. When users download this content, the
|
||||
content moves out from the OpenStack installation. Because this
|
||||
service operates primarily as a backup, most of the traffic moves
|
||||
southbound into the environment. In this situation, it benefits you
|
||||
to configure a network to be asymmetrically downstream because the
|
||||
traffic that enters the OpenStack installation is greater than the
|
||||
traffic that leaves the installation.
|
||||
|
||||
east-west traffic
|
||||
Likely to be fully symmetric. Because replication originates from
|
||||
any node and might target multiple other nodes algorithmically, it
|
||||
is less likely for this traffic to have a larger volume in any
|
||||
specific direction. However this traffic might interfere with
|
||||
north-south traffic.
|
||||
|
||||
.. figure:: figures/Network_Cloud_Storage2.png
|
||||
|
||||
This application prioritizes the north-south traffic over east-west
|
||||
traffic: the north-south traffic involves customer-facing data.
|
||||
|
||||
The network design in this case is less dependent on availability and
|
||||
more dependent on being able to handle high bandwidth. As a direct
|
||||
result, it is beneficial to forgo redundant links in favor of bonding
|
||||
those connections. This increases available bandwidth. It is also
|
||||
beneficial to configure all devices in the path, including OpenStack, to
|
||||
generate and pass jumbo frames.
|
42
doc/arch-design-draft/source/arch-examples-specialized.rst
Normal file
@ -0,0 +1,42 @@
|
||||
=================
|
||||
Specialized cases
|
||||
=================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
specialized-multi-hypervisor.rst
|
||||
specialized-networking.rst
|
||||
specialized-software-defined-networking.rst
|
||||
specialized-desktop-as-a-service.rst
|
||||
specialized-openstack-on-openstack.rst
|
||||
specialized-hardware.rst
|
||||
specialized-single-site.rst
|
||||
specialized-add-region.rst
|
||||
specialized-scaling-multiple-cells.rst
|
||||
|
||||
Although OpenStack architecture designs have been described
|
||||
in seven major scenarios outlined in other sections
|
||||
(compute focused, network focused, storage focused, general
|
||||
purpose, multi-site, hybrid cloud, and massively scalable),
|
||||
there are a few use cases that do not fit into these categories.
|
||||
This section discusses these specialized cases and provides some
|
||||
additional details and design considerations for each use case:
|
||||
|
||||
* :doc:`Specialized networking <specialized-networking>`:
|
||||
describes running networking-oriented software that may involve reading
|
||||
packets directly from the wire or participating in routing protocols.
|
||||
* :doc:`Software-defined networking (SDN)
|
||||
<specialized-software-defined-networking>`:
|
||||
describes both running an SDN controller from within OpenStack
|
||||
as well as participating in a software-defined network.
|
||||
* :doc:`Desktop-as-a-Service <specialized-desktop-as-a-service>`:
|
||||
describes running a virtualized desktop environment in a cloud
|
||||
(:term:`Desktop-as-a-Service`).
|
||||
This applies to private and public clouds.
|
||||
* :doc:`OpenStack on OpenStack <specialized-openstack-on-openstack>`:
|
||||
describes building a multi-tiered cloud by running OpenStack
|
||||
on top of an OpenStack installation.
|
||||
* :doc:`Specialized hardware <specialized-hardware>`:
|
||||
describes the use of specialized hardware devices from within
|
||||
the OpenStack environment.
|
143
doc/arch-design-draft/source/arch-examples-storage.rst
Normal file
@ -0,0 +1,143 @@
|
||||
==============================
|
||||
Storage-focused cloud examples
|
||||
==============================
|
||||
|
||||
Storage-focused architecture depends on specific use cases. This section
|
||||
discusses three example use cases:
|
||||
|
||||
* An object store with a RESTful interface
|
||||
|
||||
* Compute analytics with parallel file systems
|
||||
|
||||
* High performance database
|
||||
|
||||
The example below shows a REST interface without a high performance
|
||||
requirement.
|
||||
|
||||
Swift is a highly scalable object store that is part of the OpenStack
|
||||
project. This diagram explains the example architecture:
|
||||
|
||||
.. figure:: figures/Storage_Object.png
|
||||
|
||||
The example REST interface, presented as a traditional Object store
|
||||
running on traditional spindles, does not require a high performance
|
||||
caching tier.
|
||||
|
||||
This example uses the following components:
|
||||
|
||||
Network:
|
||||
|
||||
* 10 GbE horizontally scalable spine leaf back-end storage and front
|
||||
end network.
|
||||
|
||||
Storage hardware:
|
||||
|
||||
* 10 storage servers each with 12x4 TB disks equaling 480 TB total
|
||||
space with approximately 160 TB of usable space after replicas.
|
||||
|
||||
Proxy:
|
||||
|
||||
* 3x proxies
|
||||
|
||||
* 2x10 GbE bonded front end
|
||||
|
||||
* 2x10 GbE back-end bonds
|
||||
|
||||
* Approximately 60 Gb of total bandwidth to the back-end storage
|
||||
cluster
|
||||
|
||||
.. note::
|
||||
|
||||
It may be necessary to implement a 3rd-party caching layer for some
|
||||
applications to achieve suitable performance.
|
||||
|
||||
Compute analytics with Data processing service
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Analytics of large data sets are dependent on the performance of the
|
||||
storage system. Clouds using storage systems such as Hadoop Distributed
|
||||
File System (HDFS) have inefficiencies which can cause performance
|
||||
issues.
|
||||
|
||||
One potential solution to this problem is the implementation of storage
|
||||
systems designed for performance. Parallel file systems have previously
|
||||
filled this need in the HPC space and are suitable for large scale
|
||||
performance-orientated systems.
|
||||
|
||||
OpenStack has integration with Hadoop to manage the Hadoop cluster
|
||||
within the cloud. The following diagram shows an OpenStack store with a
|
||||
high performance requirement:
|
||||
|
||||
.. figure:: figures/Storage_Hadoop3.png
|
||||
|
||||
The hardware requirements and configuration are similar to those of the
|
||||
High Performance Database example below. In this case, the architecture
|
||||
uses Ceph's Swift-compatible REST interface, with features that allow
|
||||
connecting a caching pool to accelerate the presented
|
||||
pool.
|
||||
|
||||
High performance database with Database service
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Databases are a common workload that benefit from high performance
|
||||
storage back ends. Although enterprise storage is not a requirement,
|
||||
many environments have existing storage that an OpenStack cloud can use as
|
||||
back ends. You can create a storage pool to provide block devices with
|
||||
OpenStack Block Storage for instances as well as object interfaces. In
|
||||
this example, the database I/O requirements are high and demand storage
|
||||
presented from a fast SSD pool.
|
||||
|
||||
A storage system presents a LUN backed by a set of SSDs using a
|
||||
traditional storage array with OpenStack Block Storage integration or a
|
||||
storage platform such as Ceph or Gluster.
|
||||
|
||||
This system can provide additional performance. For example, in the
|
||||
database example below, a portion of the SSD pool can act as a block
|
||||
device to the Database server. In the high performance analytics
|
||||
example, the inline SSD cache layer accelerates the REST interface.
|
||||
|
||||
.. figure:: figures/Storage_Database_+_Object5.png
|
||||
|
||||
In this example, Ceph presents a Swift-compatible REST interface, as
|
||||
well as a block level storage from a distributed storage cluster. It is
|
||||
highly flexible and has features that enable reduced cost of operations
|
||||
such as self-healing and auto-balancing. Using erasure coded pools is a
|
||||
suitable way of maximizing the amount of usable space.
|
||||
|
||||
.. note::
|
||||
|
||||
There are special considerations around erasure coded pools. For
|
||||
example, higher computational requirements and limitations on the
|
||||
operations allowed on an object; erasure coded pools do not support
|
||||
partial writes.
|
||||
|
||||
Using Ceph as an applicable example, a potential architecture would have
|
||||
the following requirements:
|
||||
|
||||
Network:
|
||||
|
||||
* 10 GbE horizontally scalable spine leaf back-end storage and
|
||||
front-end network
|
||||
|
||||
Storage hardware:
|
||||
|
||||
* 5 storage servers for the caching layer (24x1 TB SSD)
|
||||
|
||||
* 10 storage servers each with 12x4 TB disks which equals 480 TB total
|
||||
space with approximately 160 TB of usable space after 3
|
||||
replicas
|
||||
|
||||
REST proxy:
|
||||
|
||||
* 3x proxies
|
||||
|
||||
* 2x10 GbE bonded front end
|
||||
|
||||
* 2x10 GbE back-end bonds
|
||||
|
||||
* Approximately 60 Gb of total bandwidth to the back-end storage
|
||||
cluster
|
||||
|
||||
Using an SSD cache layer, you can present block devices directly to
|
||||
hypervisors or instances. The REST interface can also use the SSD cache
|
||||
systems as an inline cache.
|
14
doc/arch-design-draft/source/arch-examples.rst
Normal file
@ -0,0 +1,14 @@
|
||||
===========================
|
||||
Cloud architecture examples
|
||||
===========================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
arch-examples-general.rst
|
||||
arch-examples-compute.rst
|
||||
arch-examples-storage.rst
|
||||
arch-examples-network.rst
|
||||
arch-examples-multi-site.rst
|
||||
arch-examples-hybrid.rst
|
||||
arch-examples-specialized.rst
|
@ -1,9 +0,0 @@
|
||||
=====================
|
||||
Example architectures
|
||||
=====================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
|
||||
|
BIN
doc/arch-design-draft/source/figures/Compute_NSX.png
Normal file
After Width: | Height: | Size: 52 KiB |
After Width: | Height: | Size: 39 KiB |
After Width: | Height: | Size: 35 KiB |
BIN
doc/arch-design-draft/source/figures/General_Architecture3.png
Normal file
After Width: | Height: | Size: 79 KiB |
After Width: | Height: | Size: 70 KiB |
BIN
doc/arch-design-draft/source/figures/Generic_CERN_Example.png
Normal file
After Width: | Height: | Size: 24 KiB |
After Width: | Height: | Size: 42 KiB |
BIN
doc/arch-design-draft/source/figures/Multi-Cloud_Priv-AWS4.png
Normal file
After Width: | Height: | Size: 59 KiB |
BIN
doc/arch-design-draft/source/figures/Multi-Cloud_Priv-Pub3.png
Normal file
After Width: | Height: | Size: 54 KiB |
BIN
doc/arch-design-draft/source/figures/Multi-Cloud_failover2.png
Normal file
After Width: | Height: | Size: 54 KiB |
After Width: | Height: | Size: 68 KiB |
After Width: | Height: | Size: 50 KiB |
After Width: | Height: | Size: 52 KiB |
After Width: | Height: | Size: 75 KiB |
BIN
doc/arch-design-draft/source/figures/Network_Cloud_Storage2.png
Normal file
After Width: | Height: | Size: 37 KiB |
BIN
doc/arch-design-draft/source/figures/Network_Web_Services1.png
Normal file
After Width: | Height: | Size: 56 KiB |
BIN
doc/arch-design-draft/source/figures/Specialized_Hardware2.png
Normal file
After Width: | Height: | Size: 46 KiB |
BIN
doc/arch-design-draft/source/figures/Specialized_OOO.png
Normal file
After Width: | Height: | Size: 56 KiB |
After Width: | Height: | Size: 30 KiB |
BIN
doc/arch-design-draft/source/figures/Specialized_SDN_hosted.png
Normal file
After Width: | Height: | Size: 22 KiB |
BIN
doc/arch-design-draft/source/figures/Specialized_VDI1.png
Normal file
After Width: | Height: | Size: 25 KiB |
After Width: | Height: | Size: 50 KiB |
BIN
doc/arch-design-draft/source/figures/Storage_Hadoop3.png
Normal file
After Width: | Height: | Size: 50 KiB |
BIN
doc/arch-design-draft/source/figures/Storage_Object.png
Normal file
After Width: | Height: | Size: 35 KiB |
@ -32,7 +32,7 @@ Contents
|
||||
high-availability.rst
|
||||
security-requirements.rst
|
||||
legal-requirements.rst
|
||||
example-architectures.rst
|
||||
arch-examples.rst
|
||||
common/app_support.rst
|
||||
common/glossary.rst
|
||||
|
||||
|
5
doc/arch-design-draft/source/specialized-add-region.rst
Normal file
@ -0,0 +1,5 @@
|
||||
=====================
|
||||
Adding another region
|
||||
=====================
|
||||
|
||||
.. TODO
|
@ -0,0 +1,47 @@
|
||||
====================
|
||||
Desktop-as-a-Service
|
||||
====================
|
||||
|
||||
Virtual Desktop Infrastructure (VDI) is a service that hosts
|
||||
user desktop environments on remote servers. This application
|
||||
is very sensitive to network latency and requires a high
|
||||
performance compute environment. Traditionally these types of
|
||||
services do not use cloud environments because few clouds
|
||||
support such a demanding workload for user-facing applications.
|
||||
As cloud environments become more robust, vendors are starting
|
||||
to provide services that provide virtual desktops in the cloud.
|
||||
OpenStack may soon provide the infrastructure for these types of deployments.
|
||||
|
||||
Challenges
|
||||
~~~~~~~~~~
|
||||
|
||||
Designing an infrastructure that is suitable to host virtual
|
||||
desktops is a very different task from hosting most other virtual workloads.
|
||||
For example, the design must consider:
|
||||
|
||||
* Boot storms, when a high volume of logins occur in a short period of time
|
||||
* The performance of the applications running on virtual desktops
|
||||
* Operating systems and their compatibility with the OpenStack hypervisor
|
||||
|
||||
Broker
|
||||
~~~~~~
|
||||
|
||||
The connection broker determines which remote desktop host
|
||||
users can access. Medium and large scale environments require a broker
|
||||
since its service represents a central component of the architecture.
|
||||
The broker is a complete management product, and enables automated
|
||||
deployment and provisioning of remote desktop hosts.
|
||||
|
||||
Possible solutions
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
There are a number of commercial products currently available that
|
||||
provide a broker solution. However, no native OpenStack projects
|
||||
provide broker services.
|
||||
Not providing a broker is also an option, but managing this manually
|
||||
would not suffice for a large-scale enterprise solution.
|
||||
|
||||
Diagram
|
||||
~~~~~~~
|
||||
|
||||
.. figure:: figures/Specialized_VDI1.png
|
43
doc/arch-design-draft/source/specialized-hardware.rst
Normal file
@ -0,0 +1,43 @@
|
||||
====================
|
||||
Specialized hardware
|
||||
====================
|
||||
|
||||
Certain workloads require specialized hardware devices that
|
||||
have significant virtualization or sharing challenges.
|
||||
Applications such as load balancers, highly parallel brute
|
||||
force computing, and direct to wire networking may need
|
||||
capabilities that basic OpenStack components do not provide.
|
||||
|
||||
Challenges
|
||||
~~~~~~~~~~
|
||||
|
||||
Some applications need access to hardware devices to either
|
||||
improve performance or provide capabilities beyond
|
||||
virtual CPU, RAM, network, or storage. These can be a shared
|
||||
resource, such as a cryptography processor, or a dedicated
|
||||
resource, such as a Graphics Processing Unit (GPU). OpenStack can
|
||||
provide some of these, while others may need extra work.
|
||||
|
||||
Solutions
|
||||
~~~~~~~~~
|
||||
|
||||
To provide cryptography offloading to a set of instances,
|
||||
you can use Image service configuration options.
|
||||
For example, assign the cryptography chip to a device node in the guest.
|
||||
The OpenStack Command Line Reference contains further information on
|
||||
configuring this solution in the section `Image service property keys
|
||||
<http://docs.openstack.org/cli-reference/glance.html#image-service-property-keys>`_.
|
||||
A challenge, however, is that this option allows all guests using the
|
||||
configured images to access the hypervisor cryptography device.
|
||||
|
||||
If you require direct access to a specific device, PCI pass-through
|
||||
enables you to dedicate the device to a single instance per hypervisor.
|
||||
You must define a flavor that specifically requests the PCI device in order
|
||||
to properly schedule instances.
|
||||
More information regarding PCI pass-through, including instructions for
|
||||
implementing and using it, is available at
|
||||
`https://wiki.openstack.org/wiki/Pci_passthrough <https://wiki.openstack.org/
|
||||
wiki/Pci_passthrough#How_to_check_PCI_status_with_PCI_api_patches>`_.
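
As an illustration, a flavor that requests one passed-through device
can be defined with python-novaclient roughly as follows. This assumes
a PCI alias named ``gpu`` has already been configured in ``nova.conf``
on the controllers and compute nodes; the credentials and sizes are
placeholders.

.. code-block:: python

   from keystoneauth1 import loading, session
   from novaclient import client

   loader = loading.get_plugin_loader("password")
   auth = loader.load_from_options(
       auth_url="https://keystone.example.com:5000/v3",
       username="admin", password="PASSWORD",
       project_name="admin",
       user_domain_name="Default", project_domain_name="Default",
   )
   nova = client.Client("2.1", session=session.Session(auth=auth))

   flavor = nova.flavors.create("gpu.large", ram=16384, vcpus=8, disk=80)
   # One device from the "gpu" alias is attached to every instance of this
   # flavor; the scheduler only picks hosts that can supply it.
   flavor.set_keys({"pci_passthrough:alias": "gpu:1"})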
|
||||
|
||||
.. figure:: figures/Specialized_Hardware2.png
|
||||
:width: 100%
|
@ -0,0 +1,78 @@
|
||||
========================
|
||||
Multi-hypervisor example
|
||||
========================
|
||||
|
||||
A financial company requires its applications migrated
|
||||
from a traditional, virtualized environment to an API-driven,
|
||||
orchestrated environment. The new environment needs
|
||||
multiple hypervisors since many of the company's applications
|
||||
have strict hypervisor requirements.
|
||||
|
||||
Currently, the company's vSphere environment runs 20 VMware
|
||||
ESXi hypervisors. These hypervisors support 300 instances of
|
||||
various sizes. Approximately 50 of these instances must run
|
||||
on ESXi. The remaining 250 or so have more flexible requirements.
|
||||
|
||||
The financial company decides to manage the
|
||||
overall system with a common OpenStack platform.
|
||||
|
||||
.. figure:: figures/Compute_NSX.png
|
||||
:width: 100%
|
||||
|
||||
Architecture planning teams decided to run a host aggregate
|
||||
containing KVM hypervisors for the general purpose instances.
|
||||
A separate host aggregate targets instances requiring ESXi.
|
||||
|
||||
Images in the OpenStack Image service have particular
|
||||
hypervisor metadata attached. When a user requests a
|
||||
certain image, the instance spawns on the relevant aggregate.
|
||||
|
||||
Images for ESXi use the VMDK format. You can convert
|
||||
QEMU disk images to VMDK (VMFS flat disks). These disk images
|
||||
can also be thin, thick, zeroed-thick, and eager-zeroed-thick.
|
||||
After exporting a VMFS thin disk from VMFS to the
|
||||
OpenStack Image service (a non-VMFS location), it becomes a
|
||||
preallocated flat disk. This impacts the transfer time from the
|
||||
OpenStack Image service to the data store since transfers require
|
||||
moving the full preallocated flat disk rather than the thin disk.
|
||||
|
||||
The VMware host aggregate compute nodes communicate with
|
||||
vCenter rather than spawning directly on a hypervisor.
|
||||
The vCenter then requests scheduling for the instance to run on
|
||||
an ESXi hypervisor.
|
||||
|
||||
This functionality requires that VMware Distributed Resource
|
||||
Scheduler (DRS) is enabled on a cluster and set to **Fully Automated**.
|
||||
vSphere requires shared storage because DRS uses vMotion,
|
||||
which is a service that relies on shared storage.
|
||||
|
||||
This solution to the company's migration uses shared storage
|
||||
to provide Block Storage capabilities to the KVM instances while
|
||||
also providing vSphere storage. The new environment provides this
|
||||
storage functionality using a dedicated data network. The
|
||||
compute hosts should have dedicated NICs to support the
|
||||
dedicated data network. vSphere supports OpenStack Block Storage. This
|
||||
support gives storage from a VMFS datastore to an instance. For the
|
||||
financial company, Block Storage in their new architecture supports
|
||||
both hypervisors.
|
||||
|
||||
OpenStack Networking provides network connectivity in this new
|
||||
architecture, with the VMware NSX plug-in driver configured. Legacy
|
||||
networking (nova-network) supports both hypervisors in this new
|
||||
architecture example, but has limitations. Specifically, vSphere
|
||||
with legacy networking does not support security groups. The new
|
||||
architecture uses VMware NSX as a part of the design. When users launch an
|
||||
instance within either of the host aggregates, VMware NSX ensures the
|
||||
instance attaches to the appropriate network overlay-based logical networks.
|
||||
|
||||
The architecture planning teams also consider OpenStack Compute integration.
|
||||
When running vSphere in an OpenStack environment, nova-compute
|
||||
communications with vCenter appear as a single large hypervisor.
|
||||
This hypervisor represents the entire ESXi cluster. Multiple nova-compute
|
||||
instances can represent multiple ESXi clusters. They can connect to
|
||||
multiple vCenter servers. If the process running nova-compute
|
||||
crashes it cuts the connection to the vCenter server.
|
||||
Any ESXi clusters will stop running, and you will not be able to
|
||||
provision further instances on the vCenter, even if you enable high
|
||||
availability. You must monitor the nova-compute service connected
|
||||
to vSphere carefully for any disruptions as a result of this failure point.
|
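
You can watch for this failure by checking the state of the
nova-compute services and the hypervisors they expose, for example:

.. code-block:: console

   # Each vCenter-backed nova-compute service should report state "up".
   $ openstack compute service list --service nova-compute

   # Each ESXi cluster managed through vCenter appears as one hypervisor.
   $ openstack hypervisor list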
32
doc/arch-design-draft/source/specialized-networking.rst
Normal file
@ -0,0 +1,32 @@

==============================
Specialized networking example
==============================

Some applications that interact with a network require specialized
connectivity. For example, a looking glass application requires the
ability to connect to a BGP peer, and a route participant application
may need to join a network at layer 2.

Challenges
~~~~~~~~~~

Connecting specialized network applications to their required resources
alters the design of an OpenStack installation. Installations that rely
on overlay networks cannot support a routing participant, and may also
block layer-2 listeners.

Possible solutions
~~~~~~~~~~~~~~~~~~

Deploying an OpenStack installation using OpenStack Networking with a
provider network allows direct layer-2 connectivity to an upstream
networking device. This design provides the layer-2 connectivity
required to communicate via the Intermediate System to Intermediate
System (IS-IS) protocol or to pass packets controlled by an OpenFlow
controller. Using the Modular Layer 2 (ML2) plug-in with an agent such
as :term:`Open vSwitch` allows a private connection through a VLAN
directly to a specific port in a layer-3 device. This allows a BGP
point-to-point link to join the autonomous system. Avoid layer-3
plug-ins, as they divide the broadcast domain and prevent router
adjacencies from forming.
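
As a sketch of the provider-network approach, a VLAN-backed network can
be created and handed to the routing application. The physical network
label ``physnet1``, the VLAN ID, and the subnet range are assumptions
that must match the local switch and plug-in configuration, and the
exact client options vary between OpenStack client releases:

.. code-block:: console

   # VLAN ID, physical network label, and addresses are illustrative only.
   $ openstack network create --provider-network-type vlan \
     --provider-physical-network physnet1 --provider-segment 2000 \
     bgp-peering-net
   $ openstack subnet create --network bgp-peering-net \
     --subnet-range 192.0.2.0/24 --no-dhcp bgp-peering-subnet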
@ -0,0 +1,70 @@

======================
OpenStack on OpenStack
======================

In some cases, users may run OpenStack nested on top of another
OpenStack cloud. This scenario describes how to manage and provision
complete OpenStack environments on instances that run on hypervisors
and servers controlled by an underlying OpenStack environment.

Public cloud providers can use this technique to manage the upgrade and
maintenance process on complete OpenStack environments. Developers and
those testing OpenStack can also use this technique to provision their
own OpenStack environments on available OpenStack Compute resources,
whether public or private.

Challenges
~~~~~~~~~~

The network aspect of deploying a nested cloud is the most complicated
part of this architecture. Because the bare-metal cloud owns all the
hardware, you must expose VLANs to the physical ports on which the
underlying cloud runs, and you must also expose them to the nested
levels. Alternatively, the OpenStack environment running on the host
OpenStack environment can use overlay networking to provide the
required software-defined networking for the deployment.

Hypervisor
~~~~~~~~~~

In this example architecture, consider which approach to take to
provide a nested hypervisor in OpenStack. This decision influences
which operating systems you use for the nested OpenStack deployments.

Possible solutions: deployment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Deployment of a full stack can be challenging, but you can mitigate
this difficulty by creating a Heat template that deploys the entire
stack or by using a configuration management system. After creating the
Heat template, you can automate the deployment of additional stacks.
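
As a sketch of the Heat-based approach, once a template describing the
nested control plane and compute instances exists, each additional
environment becomes a single command; the template, parameter, and
stack names below are placeholders:

.. code-block:: console

   # Template, parameter, and stack names are placeholders.
   $ openstack stack create --template nested-openstack.yaml \
     --parameter key_name=ops-key nested-cloud-01
   $ openstack stack list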

The OpenStack-on-OpenStack project (:term:`TripleO`) addresses this
issue. Currently, however, the project does not completely cover nested
stacks. For more information, see
https://wiki.openstack.org/wiki/TripleO.

Possible solutions: hypervisor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the case of running TripleO, the underlying OpenStack cloud deploys
the compute nodes as bare-metal servers. You then deploy OpenStack on
these bare-metal compute servers with the appropriate hypervisor, such
as KVM.

In the case of running smaller OpenStack clouds for testing purposes,
where performance is not a critical factor, you can use QEMU instead.
It is also possible to run a KVM hypervisor in an instance (see
http://davejingtian.org/2014/03/30/nested-kvm-just-for-fun/), though
this is not a supported configuration and could be a complex solution
for such a use case.
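
For the nested-KVM case, the physical compute node must expose hardware
virtualization to its guests. The following commands are a sketch for
Intel hardware (use the ``kvm_amd`` module on AMD), and they assume the
nested compute nodes then set ``virt_type = kvm``, or fall back to
``qemu``, in the ``[libvirt]`` section of ``nova.conf``:

.. code-block:: console

   # On the physical compute node (Intel example; use kvm_amd on AMD).
   # Reloading the module requires that no guests are running.
   $ cat /sys/module/kvm_intel/parameters/nested
   $ echo "options kvm_intel nested=1" | sudo tee /etc/modprobe.d/kvm-nested.conf
   $ sudo modprobe -r kvm_intel && sudo modprobe kvm_intel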

Diagram
~~~~~~~

.. figure:: figures/Specialized_OOO.png
   :width: 100%

@ -0,0 +1,5 @@

======================
Scaling multiple cells
======================

.. TODO
5
doc/arch-design-draft/source/specialized-single-site.rst
Normal file
@ -0,0 +1,5 @@

==================================================
Single site architecture with OpenStack Networking
==================================================

.. TODO
@ -0,0 +1,46 @@

===========================
Software-defined networking
===========================

Software-defined networking (SDN) is the separation of the data plane
from the control plane. SDN is a popular method of managing and
controlling packet flows within networks. SDN uses overlays or directly
controlled layer-2 devices to determine flow paths, and as such
presents challenges to a cloud environment. Some designers may wish to
run their controllers within an OpenStack installation. Others may wish
to have their installations participate in an SDN-controlled network.

Challenges
~~~~~~~~~~

SDN is a relatively new concept that is not yet standardized, so SDN
systems come in a variety of different implementations. Because of
this, a truly prescriptive architecture is not feasible. Instead,
examine the differences between an existing and a planned OpenStack
design and determine where potential conflicts and gaps exist.

Possible solutions
~~~~~~~~~~~~~~~~~~

If an SDN implementation requires layer-2 access because it directly
manipulates switches, we do not recommend running an overlay network or
a layer-3 agent. If the controller resides within an OpenStack
installation, it may be necessary to build an ML2 plug-in and schedule
the controller instances to connect to tenant VLANs so that they can
talk directly to the switch hardware. Alternatively, depending on
external device support, use a tunnel that terminates at the switch
hardware itself.
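
As a sketch of the hosted-controller case, the controller instances can
be scheduled onto a VLAN network that maps to the switch fabric. The
physical network label, VLAN ID, image, and flavor names below are
assumptions that must match the ML2 configuration, the upstream switch
configuration, and the local cloud:

.. code-block:: console

   # Network, image, and flavor names are placeholders; the VLAN ID and
   # physical network label must match the ML2 and switch configuration.
   $ openstack network create --provider-network-type vlan \
     --provider-physical-network physnet1 --provider-segment 410 \
     sdn-controller-net
   $ openstack server create --image sdn-controller --flavor m1.large \
     --network sdn-controller-net sdn-controller-01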

Diagram
~~~~~~~

OpenStack-hosted SDN controller:

.. figure:: figures/Specialized_SDN_hosted.png

OpenStack participating in an SDN controller network:

.. figure:: figures/Specialized_SDN_external.png