Merge "[arch-design] Migrate arch content from ops-guide"
commit bdc95b5e90

=============
Compute Nodes
=============

This chapter describes some of the choices you need to consider
when designing and building your compute nodes. Compute nodes form the
resource core of the OpenStack Compute cloud, providing the processing, memory,
network and storage resources to run instances.

Choosing a CPU
~~~~~~~~~~~~~~

The type of CPU in your compute node is a very important choice. First,
ensure that the CPU supports virtualization by way of *VT-x* for Intel
chips and *AMD-V* for AMD chips.

.. tip::

   Consult the vendor documentation to check for virtualization
   support. For Intel, read `“Does my processor support Intel® Virtualization
   Technology?” <http://www.intel.com/support/processors/sb/cs-030729.htm>`_.
   For AMD, read `AMD Virtualization
   <http://www.amd.com/en-us/innovations/software-technologies/server-solution/virtualization>`_.
   Note that your CPU may support virtualization but it may be
   disabled. Consult your BIOS documentation for how to enable CPU
   features.

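If you want to check a candidate host quickly, the CPU flags in
``/proc/cpuinfo`` advertise this support (``vmx`` for Intel VT-x, ``svm`` for
AMD-V). The following is only a minimal sketch of such a check, not part of
any OpenStack tooling, and a present flag does not guarantee that the feature
is enabled in the BIOS:

.. code-block:: python

   # Minimal sketch: look for hardware virtualization flags in /proc/cpuinfo.
   # vmx indicates Intel VT-x, svm indicates AMD-V. A present flag shows CPU
   # support only; the feature may still be disabled in the BIOS.

   def virtualization_flags(cpuinfo_path="/proc/cpuinfo"):
       flags = set()
       with open(cpuinfo_path) as handle:
           for line in handle:
               if line.startswith("flags"):
                   flags.update(line.split(":", 1)[1].split())
       return {"vmx": "vmx" in flags, "svm": "svm" in flags}

   if __name__ == "__main__":
       print(virtualization_flags())
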
The number of cores that the CPU has also affects the decision. It's
common for current CPUs to have up to 12 cores. Additionally, if an
Intel CPU supports hyperthreading, those 12 cores are doubled to 24
cores. If you purchase a server that supports multiple CPUs, the number
of cores is further multiplied.

.. note::

   **Multithread Considerations**

   Hyper-Threading is Intel's proprietary simultaneous multithreading
   implementation used to improve parallelization on their CPUs. You might
   consider enabling Hyper-Threading to improve the performance of
   multithreaded applications.

   Whether you should enable Hyper-Threading on your CPUs depends upon your
   use case. For example, disabling Hyper-Threading can be beneficial in
   intense computing environments. We recommend that you do performance
   testing with your local workload with both Hyper-Threading on and off to
   determine what is more appropriate in your case.

Choosing a Hypervisor
~~~~~~~~~~~~~~~~~~~~~

A hypervisor provides software to manage virtual machine access to the
underlying hardware. The hypervisor creates, manages, and monitors
virtual machines. OpenStack Compute supports many hypervisors to various
degrees, including:

* `KVM <http://www.linux-kvm.org/page/Main_Page>`_
* `LXC <https://linuxcontainers.org/>`_
* `QEMU <http://wiki.qemu.org/Main_Page>`_
* `VMware ESX/ESXi <https://www.vmware.com/support/vsphere-hypervisor>`_
* `Xen <http://www.xenproject.org/>`_
* `Hyper-V <http://technet.microsoft.com/en-us/library/hh831531.aspx>`_
* `Docker <https://www.docker.com/>`_

Probably the most important factor in your choice of hypervisor is your
current usage or experience. Aside from that, there are practical
concerns to do with feature parity, documentation, and the level of
community experience.

For example, KVM is the most widely adopted hypervisor in the OpenStack
community. Besides KVM, more deployments run Xen, LXC, VMware, and
Hyper-V than the others listed. However, each of these is lacking some
feature support or the documentation on how to use them with OpenStack
is out of date.

The best information available to support your choice is found on the
`Hypervisor Support Matrix
<http://docs.openstack.org/developer/nova/support-matrix.html>`_
and in the `configuration reference
<http://docs.openstack.org/mitaka/config-reference/compute/hypervisors.html>`_.

.. note::

   It is also possible to run multiple hypervisors in a single
   deployment using host aggregates or cells. However, an individual
   compute node can run only a single hypervisor at a time.

Instance Storage Solutions
~~~~~~~~~~~~~~~~~~~~~~~~~~

As part of the procurement for a compute cluster, you must specify some
storage for the disk on which the instantiated instance runs. There are
three main approaches to providing this temporary-style storage, and it
is important to understand the implications of the choice.

They are:

* Off compute node storage—shared file system
* On compute node storage—shared file system
* On compute node storage—nonshared file system

In general, the questions you should ask when selecting storage are as
follows:

* What is the platter count you can achieve?
* Do more spindles result in better I/O despite network access?
* Which one results in the best cost-performance scenario you are aiming for?
* How do you manage the storage operationally?

Many operators use separate compute and storage hosts. Compute services
and storage services have different requirements, and compute hosts
typically require more CPU and RAM than storage hosts. Therefore, for a
fixed budget, it makes sense to have different configurations for your
compute nodes and your storage nodes. Compute nodes will be invested in
CPU and RAM, and storage nodes will be invested in block storage.

However, if you are more restricted in the number of physical hosts you
have available for creating your cloud and you want to be able to
dedicate as many of your hosts as possible to running instances, it
makes sense to run compute and storage on the same machines.

The three main approaches to instance storage are provided in the next
few sections.

Off Compute Node Storage—Shared File System
-------------------------------------------

In this option, the disks storing the running instances are hosted in
servers outside of the compute nodes.

If you use separate compute and storage hosts, you can treat your
compute hosts as "stateless." As long as you don't have any instances
currently running on a compute host, you can take it offline or wipe it
completely without having any effect on the rest of your cloud. This
simplifies maintenance for the compute hosts.

There are several advantages to this approach:

* If a compute node fails, instances are usually easily recoverable.
* Running a dedicated storage system can be operationally simpler.
* You can scale to any number of spindles.
* It may be possible to share the external storage for other purposes.

The main downsides to this approach are:

* Depending on design, heavy I/O usage from some instances can affect
  unrelated instances.
* Use of the network can decrease performance.

On Compute Node Storage—Shared File System
------------------------------------------

In this option, each compute node is specified with a significant amount
of disk space, but a distributed file system ties the disks from each
compute node into a single mount.

The main advantage of this option is that it scales to external storage
when you require additional storage.

However, this option has several downsides:

* Running a distributed file system can make you lose your data
  locality compared with nonshared storage.
* Recovery of instances is complicated by depending on multiple hosts.
* The chassis size of the compute node can limit the number of spindles
  able to be used in a compute node.
* Use of the network can decrease performance.

On Compute Node Storage—Nonshared File System
---------------------------------------------

In this option, each compute node is specified with enough disks to
store the instances it hosts.

There are two main reasons why this is a good idea:

* Heavy I/O usage on one compute node does not affect instances on
  other compute nodes.
* Direct I/O access can increase performance.

This has several downsides:

* If a compute node fails, the instances running on that node are lost.
* The chassis size of the compute node can limit the number of spindles
  able to be used in a compute node.
* Migrations of instances from one node to another are more complicated
  and rely on features that may not continue to be developed.
* If additional storage is required, this option does not scale.

Running a shared file system on a storage system apart from the compute
nodes is ideal for clouds where reliability and scalability are the most
important factors. Running a shared file system on the compute nodes
themselves may be best in a scenario where you have to deploy to
preexisting servers for which you have little to no control over their
specifications. Running a nonshared file system on the compute nodes
themselves is a good option for clouds with high I/O requirements and
low concern for reliability.

Issues with Live Migration
--------------------------

Live migration is an integral part of the operations of the
cloud. This feature provides the ability to seamlessly move instances
from one physical host to another, a necessity for performing upgrades
that require reboots of the compute hosts, but only works well with
shared storage.

Live migration can also be done with nonshared storage, using a feature
known as *KVM live block migration*. While an earlier implementation of
block-based migration in KVM and QEMU was considered unreliable, there
is a newer, more reliable implementation of block-based live migration
as of QEMU 1.4 and libvirt 1.0.2 that is also compatible with OpenStack.

Choice of File System
---------------------

If you want to support shared-storage live migration, you need to
configure a distributed file system.

Possible options include:

* NFS (default for Linux)
* GlusterFS
* MooseFS
* Lustre

We recommend that you choose the option operators are most familiar with.
NFS is the easiest to set up and there is extensive community knowledge
about it.

Overcommitting
~~~~~~~~~~~~~~

OpenStack allows you to overcommit CPU and RAM on compute nodes. This
allows you to increase the number of instances you can have running on
your cloud, at the cost of reducing the performance of the instances.
OpenStack Compute uses the following ratios by default:

* CPU allocation ratio: 16:1
* RAM allocation ratio: 1.5:1

The default CPU allocation ratio of 16:1 means that the scheduler
allocates up to 16 virtual cores per physical core. For example, if a
physical node has 12 cores, the scheduler sees 192 available virtual
cores. With typical flavor definitions of 4 virtual cores per instance,
this ratio would provide 48 instances on a physical node.

The formula for the number of virtual instances on a compute node is
``(OR*PC)/VC``, where:

OR
  CPU overcommit ratio (virtual cores per physical core)

PC
  Number of physical cores

VC
  Number of virtual cores per instance

Similarly, the default RAM allocation ratio of 1.5:1 means that the
scheduler allocates instances to a physical node as long as the total
amount of RAM associated with the instances is less than 1.5 times the
amount of RAM available on the physical node.

For example, if a physical node has 48 GB of RAM, the scheduler
allocates instances to that node until the sum of the RAM associated
with the instances reaches 72 GB (such as nine instances, in the case
where each instance has 8 GB of RAM).

.. note::

   Regardless of the overcommit ratio, an instance cannot be placed
   on any physical node with fewer raw (pre-overcommit) resources than
   the instance flavor requires.

You must select the appropriate CPU and RAM allocation ratio for your
particular use case.

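As a quick sanity check of the arithmetic above, the sketch below evaluates
``(OR*PC)/VC`` and the equivalent RAM ceiling for the example node (12
physical cores, 48 GB of RAM, default ratios). The helper functions are
illustrative only and are not part of OpenStack:

.. code-block:: python

   # Worked example of the overcommit arithmetic described above.
   # Illustrative helpers only; these are not part of OpenStack.

   def max_instances_by_cpu(overcommit_ratio, physical_cores, vcpus_per_instance):
       """(OR * PC) / VC, rounded down to whole instances."""
       return int(overcommit_ratio * physical_cores // vcpus_per_instance)

   def max_instances_by_ram(ram_ratio, node_ram_gb, ram_per_instance_gb):
       """Instances fit while their total RAM stays under ratio * node RAM."""
       return int(ram_ratio * node_ram_gb // ram_per_instance_gb)

   print(max_instances_by_cpu(16, 12, 4))    # 48 instances by CPU
   print(max_instances_by_ram(1.5, 48, 8))   # 9 instances by RAM
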
Logging
~~~~~~~

Logging is described in more detail in `Logging and Monitoring
<http://docs.openstack.org/ops-guide/ops_logging_monitoring.html>`_. However,
it is an important design consideration to take into account before
commencing operations of your cloud.

OpenStack produces a great deal of useful logging information; however,
for the information to be useful for operations purposes, you should
consider having a central logging server to send logs to, and a log
parsing/analysis system (such as logstash).

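As a rough illustration of centralized logging (not how OpenStack services
themselves are configured, and with ``logs.example.com`` as a placeholder
host), the Python standard library can forward application logs to a remote
syslog collector:

.. code-block:: python

   # Minimal sketch: ship application logs to a central syslog server.
   # "logs.example.com" is a placeholder; OpenStack services are normally
   # pointed at a collector through their own logging and syslog options.
   import logging
   import logging.handlers

   logger = logging.getLogger("cloud-demo")
   logger.setLevel(logging.INFO)
   logger.addHandler(
       logging.handlers.SysLogHandler(address=("logs.example.com", 514))
   )
   logger.info("compute node provisioning started")
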
Networking
~~~~~~~~~~

Networking in OpenStack is a complex, multifaceted challenge. See
:doc:`design-networking`.

=============
Control Plane
=============

.. From Ops Guide chapter: Designing for Cloud Controllers and Cloud
   Management

OpenStack is designed to be massively horizontally scalable, which
allows all services to be distributed widely. However, to simplify this
guide, we have decided to discuss services of a more central nature,
using the concept of a *cloud controller*. A cloud controller is a
conceptual simplification. In the real world, you design an architecture
for your cloud controller that enables high availability so that if any
node fails, another can take over the required tasks. In reality, cloud
controller tasks are spread out across more than a single node.

The cloud controller provides the central management system for
OpenStack deployments. Typically, the cloud controller manages
authentication and sends messaging to all the systems through a message
queue.

For many deployments, the cloud controller is a single node. However, to
have high availability, you have to take a few considerations into
account, which we'll cover in this chapter.

The cloud controller manages the following services for the cloud:

Databases
  Tracks current information about users and instances, for example,
  in a database, typically one database instance managed per service

Message queue services
  All :term:`Advanced Message Queuing Protocol (AMQP)` messages for
  services are received and sent according to the queue broker

Conductor services
  Proxy requests to a database

Authentication and authorization for identity management
  Indicates which users can do what actions on certain cloud
  resources; quota management, however, is spread out among services

Image-management services
  Stores and serves images with metadata on each, for launching in the
  cloud

Scheduling services
  Indicates which resources to use first; for example, spreading out
  where instances are launched based on an algorithm

User dashboard
  Provides a web-based front end for users to consume OpenStack cloud
  services

API endpoints
  Offers each service's REST API access, where the API endpoint
  catalog is managed by the Identity service

For our example, the cloud controller has a collection of ``nova-*``
components that represent the global state of the cloud; talks to
services such as authentication; maintains information about the cloud
in a database; communicates to all compute nodes and storage
:term:`workers <worker>` through a queue; and provides API access.
Each service running on a designated cloud controller may be broken out
into separate nodes for scalability or availability.

As another example, you could use pairs of servers for a collective
cloud controller—one active, one standby—for redundant nodes providing a
given set of related services, such as:

- Front end web for API requests, the scheduler for choosing which
  compute node to boot an instance on, Identity services, and the
  dashboard

- Database and message queue server (such as MySQL, RabbitMQ)

- Image service for the image management

Now that you see the myriad designs for controlling your cloud, read
more about the further considerations to help with your design
decisions.

Hardware Considerations
~~~~~~~~~~~~~~~~~~~~~~~

A cloud controller's hardware can be the same as a compute node, though
you may want to further specify based on the size and type of cloud that
you run.

It's also possible to use virtual machines for all or some of the
services that the cloud controller manages, such as the message queuing.
In this guide, we assume that all services are running directly on the
cloud controller.

:ref:`table_controller_hardware` contains common considerations to
review when sizing hardware for the cloud controller design.

.. _table_controller_hardware:

.. list-table:: Table. Cloud controller hardware sizing considerations
   :widths: 25 75
   :header-rows: 1

   * - Consideration
     - Ramification
   * - How many instances will run at once?
     - Size your database server accordingly, and scale out beyond one cloud
       controller if many instances will report status at the same time and
       scheduling where a new instance starts up needs computing power.
   * - How many compute nodes will run at once?
     - Ensure that your messaging queue handles requests successfully and size
       accordingly.
   * - How many users will access the API?
     - If many users will make multiple requests, make sure that the CPU load
       for the cloud controller can handle it.
   * - How many users will access the dashboard versus the REST API directly?
     - The dashboard makes many requests, even more than the API access, so
       add even more CPU if your dashboard is the main interface for your users.
   * - How many ``nova-api`` services do you run at once for your cloud?
     - You need to size the controller with a core per service.
   * - How long does a single instance run?
     - Starting instances and deleting instances is demanding on the compute
       node but also demanding on the controller node because of all the API
       queries and scheduling needs.
   * - Does your authentication system also verify externally?
     - External systems such as :term:`LDAP <Lightweight Directory Access
       Protocol (LDAP)>` or :term:`Active Directory` require network
       connectivity between the cloud controller and an external authentication
       system. Also ensure that the cloud controller has the CPU power to keep
       up with requests.

Separation of Services
~~~~~~~~~~~~~~~~~~~~~~

While our example contains all central services in a single location, it
is possible and indeed often a good idea to separate services onto
different physical servers. :ref:`table_deployment_scenarios` is a list
of deployment scenarios we've seen and their justifications.

.. _table_deployment_scenarios:

.. list-table:: Table. Deployment scenarios
   :widths: 25 75
   :header-rows: 1

   * - Scenario
     - Justification
   * - Run ``glance-*`` servers on the ``swift-proxy`` server.
     - This deployment felt that the spare I/O on the Object Storage proxy
       server was sufficient and that the Image Delivery portion of glance
       benefited from being on physical hardware and having good connectivity
       to the Object Storage back end it was using.
   * - Run a central dedicated database server.
     - This deployment used a central dedicated server to provide the databases
       for all services. This approach simplified operations by isolating
       database server updates and allowed for the simple creation of slave
       database servers for failover.
   * - Run one VM per service.
     - This deployment ran central services on a set of servers running KVM.
       A dedicated VM was created for each service (``nova-scheduler``,
       rabbitmq, database, etc). This assisted the deployment with scaling
       because administrators could tune the resources given to each virtual
       machine based on the load it received (something that was not well
       understood during installation).
   * - Use an external load balancer.
     - This deployment had an expensive hardware load balancer in its
       organization. It ran multiple ``nova-api`` and ``swift-proxy``
       servers on different physical servers and used the load balancer
       to switch between them.

One choice that always comes up is whether to virtualize. Some services,
such as ``nova-compute``, ``swift-proxy`` and ``swift-object`` servers,
should not be virtualized. However, control servers can often be happily
virtualized—the performance penalty can usually be offset by simply
running more of the service.

Database
~~~~~~~~

OpenStack Compute uses an SQL database to store and retrieve stateful
information. MySQL is the popular database choice in the OpenStack
community.

Loss of the database leads to errors. As a result, we recommend that you
cluster your database to make it failure tolerant. Configuring and
maintaining a database cluster is done outside OpenStack and is
determined by the database software you choose to use in your cloud
environment. MySQL/Galera is a popular option for MySQL-based databases.

Message Queue
~~~~~~~~~~~~~

Most OpenStack services communicate with each other using the *message
queue*. For example, Compute communicates to block storage services and
networking services through the message queue. Also, you can optionally
enable notifications for any service. RabbitMQ, Qpid, and Zeromq are all
popular choices for a message-queue service. In general, if the message
queue fails or becomes inaccessible, the cluster grinds to a halt and
ends up in a read-only state, with information stuck at the point where
the last message was sent. Accordingly, we recommend that you cluster
the message queue. Be aware that clustered message queues can be a pain
point for many OpenStack deployments. While RabbitMQ has native
clustering support, there have been reports of issues when running it at
a large scale. While other queuing solutions are available, such as Zeromq
and Qpid, Zeromq does not offer stateful queues. Qpid is the messaging
system of choice for Red Hat and its derivatives. Qpid does not have
native clustering capabilities and requires a supplemental service, such
as Pacemaker or Corosync. For your message queue, you need to determine
what level of data loss you are comfortable with and whether to use an
OpenStack project's ability to retry multiple MQ hosts in the event of a
failure, such as using Compute's ability to do so.

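OpenStack services talk to the broker through the oslo messaging layer rather
than directly, but as a rough standalone illustration of what a message-queue
interaction with RabbitMQ looks like, here is a minimal sketch using the
third-party ``pika`` client (the queue name and host are placeholders):

.. code-block:: python

   # Minimal sketch of publishing a message to RabbitMQ with the pika client.
   # Queue name and host are placeholders; OpenStack services use the oslo
   # messaging layer on top of the broker rather than talking to it directly.
   import pika

   connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
   channel = connection.channel()
   channel.queue_declare(queue="demo-notifications", durable=True)
   channel.basic_publish(
       exchange="",
       routing_key="demo-notifications",
       body=b"instance scheduled",
   )
   connection.close()
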
Conductor Services
~~~~~~~~~~~~~~~~~~

In the previous version of OpenStack, all ``nova-compute`` services
required direct access to the database hosted on the cloud controller.
This was problematic for two reasons: security and performance. With
regard to security, if a compute node is compromised, the attacker
inherently has access to the database. With regard to performance,
``nova-compute`` calls to the database are single-threaded and blocking.
This creates a performance bottleneck because database requests are
fulfilled serially rather than in parallel.

The conductor service resolves both of these issues by acting as a proxy
for the ``nova-compute`` service. Now, instead of ``nova-compute``
directly accessing the database, it contacts the ``nova-conductor``
service, and ``nova-conductor`` accesses the database on
``nova-compute``'s behalf. Since ``nova-compute`` no longer has direct
access to the database, the security issue is resolved. Additionally,
``nova-conductor`` is a nonblocking service, so requests from all
compute nodes are fulfilled in parallel.

.. note::

   If you are using ``nova-network`` and multi-host networking in your
   cloud environment, ``nova-compute`` still requires direct access to
   the database.

The ``nova-conductor`` service is horizontally scalable. To make
``nova-conductor`` highly available and fault tolerant, just launch more
instances of the ``nova-conductor`` process, either on the same server
or across multiple servers.

Application Programming Interface (API)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All public access, whether direct, through a command-line client, or
through the web-based dashboard, uses the API service. Find the API
reference at http://developer.openstack.org/.

You must choose whether you want to support the Amazon EC2 compatibility
APIs, or just the OpenStack APIs. One issue you might encounter when
running both APIs is an inconsistent experience when referring to images
and instances.

For example, the EC2 API refers to instances using IDs that contain
hexadecimal, whereas the OpenStack API uses names and digits. Similarly,
the EC2 API tends to rely on DNS aliases for contacting virtual
machines, as opposed to OpenStack, which typically lists IP
addresses.

If OpenStack is not set up in the right way, it is simple to have
scenarios in which users are unable to contact their instances due to
having only an incorrect DNS alias. Despite this, EC2 compatibility can
assist users migrating to your cloud.

As with databases and message queues, having more than one :term:`API server`
is a good thing. Traditional HTTP load-balancing techniques can be used to
achieve a highly available ``nova-api`` service.

Extensions
~~~~~~~~~~

The `API
Specifications <http://docs.openstack.org/api/api-specs.html>`_ define
the core actions, capabilities, and media types of the OpenStack API. A
client can always depend on the availability of this core API, and
implementers are always required to support it in its entirety.
Requiring strict adherence to the core API allows clients to rely upon a
minimal level of functionality when interacting with multiple
implementations of the same API.

The OpenStack Compute API is extensible. An extension adds capabilities
to an API beyond those defined in the core. The introduction of new
features, MIME types, actions, states, headers, parameters, and
resources can all be accomplished by means of extensions to the core
API. This allows the introduction of new features in the API without
requiring a version change and allows the introduction of
vendor-specific niche functionality.

Scheduling
~~~~~~~~~~

The scheduling services are responsible for determining the compute or
storage node where a virtual machine or block storage volume should be
created. The scheduling services receive creation requests for these
resources from the message queue and then begin the process of
determining the appropriate node where the resource should reside. This
process is done by applying a series of user-configurable filters
against the available collection of nodes.

There are currently two schedulers: ``nova-scheduler`` for virtual
machines and ``cinder-scheduler`` for block storage volumes. Both
schedulers are able to scale horizontally, so for high-availability
purposes, or for very large or high-schedule-frequency installations,
you should consider running multiple instances of each scheduler. The
schedulers all listen to the shared message queue, so no special load
balancing is required.

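The filter idea can be pictured as successively narrowing the candidate host
list. The sketch below is purely conceptual; the host fields and filter
functions are invented for illustration and do not use nova's actual filter
classes:

.. code-block:: python

   # Conceptual sketch of filter-based scheduling: each filter trims the
   # candidate host list. Host fields and filters here are invented for
   # illustration; nova's real filters are configurable classes.

   hosts = [
       {"name": "compute1", "free_ram_gb": 64, "free_vcpus": 40},
       {"name": "compute2", "free_ram_gb": 8, "free_vcpus": 4},
   ]
   request = {"ram_gb": 16, "vcpus": 8}

   filters = [
       lambda host, req: host["free_ram_gb"] >= req["ram_gb"],
       lambda host, req: host["free_vcpus"] >= req["vcpus"],
   ]

   candidates = [h for h in hosts if all(f(h, request) for f in filters)]
   print([h["name"] for h in candidates])  # ['compute1']
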
Images
~~~~~~

The OpenStack Image service consists of two parts: ``glance-api`` and
``glance-registry``. The former is responsible for the delivery of
images; the compute node uses it to download images from the back end.
The latter maintains the metadata information associated with virtual
machine images and requires a database.

The ``glance-api`` part is an abstraction layer that allows a choice of
back end. Currently, it supports:

OpenStack Object Storage
  Allows you to store images as objects.

File system
  Uses any traditional file system to store the images as files.

S3
  Allows you to fetch images from Amazon S3.

HTTP
  Allows you to fetch images from a web server. You cannot write
  images by using this mode.

If you have an OpenStack Object Storage service, we recommend using this
as a scalable place to store your images. You can also use a file system
with sufficient performance or Amazon S3—unless you do not need the
ability to upload new images through OpenStack.

Dashboard
~~~~~~~~~

The OpenStack dashboard (horizon) provides a web-based user interface to
the various OpenStack components. The dashboard includes an end-user
area for users to manage their virtual infrastructure and an admin area
for cloud operators to manage the OpenStack environment as a
whole.

The dashboard is implemented as a Python web application that normally
runs in :term:`Apache` ``httpd``. Therefore, you may treat it the same as any
other web application, provided it can reach the API servers (including
their admin endpoints) over the network.

Authentication and Authorization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The concepts supporting OpenStack's authentication and authorization are
derived from well-understood and widely used systems of a similar
nature. Users have credentials they can use to authenticate, and they
can be a member of one or more groups (known as projects or tenants,
interchangeably).

For example, a cloud administrator might be able to list all instances
in the cloud, whereas a user can see only those in his current group.
Resource quotas, such as the number of cores that can be used, disk
space, and so on, are associated with a project.

OpenStack Identity provides authentication decisions and user attribute
information, which is then used by the other OpenStack services to
perform authorization. The policy is set in the ``policy.json`` file.
For information on how to configure these, see `Managing Projects and Users
<http://docs.openstack.org/ops-guide/ops_projects_users.html>`_ in the
OpenStack Operations Guide.

OpenStack Identity supports different plug-ins for authentication
decisions and identity storage. Examples of these plug-ins include:

- In-memory key-value Store (a simplified internal storage structure)

- SQL database (such as MySQL or PostgreSQL)

- Memcached (a distributed memory object caching system)

- LDAP (such as OpenLDAP or Microsoft's Active Directory)

Many deployments use the SQL database; however, LDAP is also a popular
choice for those with existing authentication infrastructure that needs
to be integrated.

Network Considerations
~~~~~~~~~~~~~~~~~~~~~~

Because the cloud controller handles so many different services, it must
be able to handle the amount of traffic that hits it. For example, if
you choose to host the OpenStack Image service on the cloud controller,
the cloud controller should be able to support the transferring of the
images at an acceptable speed.

As another example, if you choose to use single-host networking where
the cloud controller is the network gateway for all instances, then the
cloud controller must support the total amount of traffic that travels
between your cloud and the public Internet.

We recommend that you use a fast NIC, such as 10 GbE. You can also choose
to use two 10 GbE NICs and bond them together. While you might not be
able to get a full bonded 20 Gbps of throughput, different transmission
streams use different NICs. For example, if the cloud controller transfers
two images, each image uses a different NIC and gets a full 10 Gbps of
bandwidth.

==========
Networking
==========

OpenStack provides a rich networking environment. This chapter
details the requirements and options to consider when designing your
cloud. This includes examples of network implementations to
consider, information about some OpenStack network layouts and networking
services that are essential for stable operation.

.. warning::

   If this is the first time you are deploying a cloud infrastructure
   in your organization, your first conversations should be with your
   networking team. Network usage in a running cloud is vastly different
   from traditional network deployments and has the potential to be
   disruptive at both a connectivity and a policy level.

For example, you must plan the number of IP addresses that you need for
both your guest instances as well as management infrastructure.
Additionally, you must research and discuss cloud network connectivity
through proxy servers and firewalls.

See the `OpenStack Security Guide <http://docs.openstack.org/sec/>`_ for tips
on securing your network.

Management Network
~~~~~~~~~~~~~~~~~~

A :term:`management network` (a separate network for use by your cloud
operators) typically consists of a separate switch and separate NICs
(network interface cards), and is a recommended option. This segregation
prevents system administration and the monitoring of system access from
being disrupted by traffic generated by guests.

Consider creating other private networks for communication between
internal components of OpenStack, such as the message queue and
OpenStack Compute. Using a virtual local area network (VLAN) works well
for these scenarios because it provides a method for creating multiple
virtual networks on a physical network.

Public Addressing Options
~~~~~~~~~~~~~~~~~~~~~~~~~

There are two main types of IP addresses for guest virtual machines:
fixed IPs and floating IPs. Fixed IPs are assigned to instances on boot,
whereas floating IP addresses can change their association between
instances by action of the user. Both types of IP addresses can be
either public or private, depending on your use case.

Fixed IP addresses are required, whereas it is possible to run OpenStack
without floating IPs. One of the most common use cases for floating IPs
is to provide public IP addresses to a private cloud, where there are a
limited number of IP addresses available. Another is for a public cloud
user to have a "static" IP address that can be reassigned when an
instance is upgraded or moved.

Fixed IP addresses can be private for private clouds, or public for
public clouds. When an instance terminates, its fixed IP is lost. It is
worth noting that newer users of cloud computing may find their
ephemeral nature frustrating.

IP Address Planning
~~~~~~~~~~~~~~~~~~~

An OpenStack installation can potentially have many subnets (ranges of
IP addresses) and different types of services in each. An IP address
plan can assist with a shared understanding of network partition
purposes and scalability. Control services can have public and private
IP addresses, and as noted above, there are a couple of options for an
instance's public addresses.

An IP address plan might be broken down into the following sections:

Subnet router
  Packets leaving the subnet go via this address, which could be a
  dedicated router or a ``nova-network`` service.

Control services public interfaces
  Public access to ``swift-proxy``, ``nova-api``, ``glance-api``, and
  horizon come to these addresses, which could be on one side of a
  load balancer or pointing at individual machines.

Object Storage cluster internal communications
  Traffic among object/account/container servers and between these and
  the proxy server's internal interface uses this private network.

Compute and storage communications
  If ephemeral or block storage is external to the compute node, this
  network is used.

Out-of-band remote management
  If a dedicated remote access controller chip is included in servers,
  often these are on a separate network.

In-band remote management
  Often, an extra (such as 1 GbE) interface on compute or storage nodes
  is used for system administrators or monitoring tools to access the
  host instead of going through the public interface.

Spare space for future growth
  Adding more public-facing control services or guest instance IPs
  should always be part of your plan.

For example, take a deployment that has both OpenStack Compute and
Object Storage, with private ranges 172.22.42.0/24 and 172.22.87.0/26
available. One way to segregate the space might be as follows:

.. code-block:: none

   172.22.42.0/24:
   172.22.42.1 - 172.22.42.3 - subnet routers
   172.22.42.4 - 172.22.42.20 - spare for networks
   172.22.42.21 - 172.22.42.104 - Compute node remote access controllers (inc spare)
   172.22.42.105 - 172.22.42.188 - Compute node management interfaces (inc spare)
   172.22.42.189 - 172.22.42.208 - Swift proxy remote access controllers (inc spare)
   172.22.42.209 - 172.22.42.228 - Swift proxy management interfaces (inc spare)
   172.22.42.229 - 172.22.42.252 - Swift storage servers remote access controllers (inc spare)
   172.22.42.253 - 172.22.42.254 - spare
   172.22.87.0/26:
   172.22.87.1 - 172.22.87.3 - subnet routers
   172.22.87.4 - 172.22.87.24 - Swift proxy server internal interfaces (inc spare)
   172.22.87.25 - 172.22.87.63 - Swift object server internal interfaces (inc spare)

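When drafting such a plan, it can help to verify that each range really falls
inside the parent subnet and that ranges do not overlap. The following is a
small sketch using the standard ``ipaddress`` module, with ranges taken from
the example above:

.. code-block:: python

   # Sketch: verify that planned address ranges sit inside the parent
   # subnet and do not overlap. Ranges come from the example plan above.
   import ipaddress

   subnet = ipaddress.ip_network("172.22.42.0/24")
   ranges = {
       "subnet routers": ("172.22.42.1", "172.22.42.3"),
       "spare for networks": ("172.22.42.4", "172.22.42.20"),
       "compute remote access controllers": ("172.22.42.21", "172.22.42.104"),
   }

   previous_end = None
   for purpose, (start, end) in ranges.items():
       start_ip, end_ip = ipaddress.ip_address(start), ipaddress.ip_address(end)
       assert start_ip in subnet and end_ip in subnet, purpose
       assert previous_end is None or start_ip > previous_end, purpose
       previous_end = end_ip
       print(purpose, int(end_ip) - int(start_ip) + 1, "addresses")
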
A similar approach can be taken with public IP addresses, taking note
that large, flat ranges are preferred for use with guest instance IPs.
Take into account that for some OpenStack networking options, a public
IP address in the range of a guest instance public IP address is
assigned to the ``nova-compute`` host.

Network Topology
~~~~~~~~~~~~~~~~

OpenStack Compute with ``nova-network`` provides predefined network
deployment models, each with its own strengths and weaknesses. The
selection of a network manager changes your network topology, so the
choice should be made carefully. You also have a choice between the
tried-and-true legacy ``nova-network`` settings or the neutron project
for OpenStack Networking. Both offer networking for launched instances
with different implementations and requirements.

For OpenStack Networking with the neutron project, typical
configurations are documented with the idea that any setup you can
configure with real hardware you can re-create with a software-defined
equivalent. Each tenant can contain typical network elements such as
routers, and services such as :term:`DHCP`.

:ref:`table_networking_deployment` describes the networking deployment
options for both legacy ``nova-network`` options and an equivalent
neutron configuration.

.. _table_networking_deployment:

.. list-table:: Networking deployment options
   :widths: 10 30 30 30
   :header-rows: 1

   * - Network deployment model
     - Strengths
     - Weaknesses
     - Neutron equivalent
   * - Flat
     - Extremely simple topology. No DHCP overhead.
     - Requires file injection into the instance to configure network
       interfaces.
     - Configure a single bridge as the integration bridge (br-int) and
       connect it to a physical network interface with the Modular Layer 2
       (ML2) plug-in, which uses Open vSwitch by default.
   * - FlatDHCP
     - Relatively simple to deploy. Standard networking. Works with all guest
       operating systems.
     - Requires its own DHCP broadcast domain.
     - Configure DHCP agents and routing agents. Network Address Translation
       (NAT) performed outside of compute nodes, typically on one or more
       network nodes.
   * - VlanManager
     - Each tenant is isolated to its own VLANs.
     - More complex to set up. Requires its own DHCP broadcast domain.
       Requires many VLANs to be trunked onto a single port. Standard VLAN
       number limitation. Switches must support 802.1q VLAN tagging.
     - Isolated tenant networks implement some form of isolation of layer 2
       traffic between distinct networks. VLAN tagging is a key concept, where
       traffic is “tagged” with an ordinal identifier for the VLAN. Isolated
       network implementations may or may not include additional services like
       DHCP, NAT, and routing.
   * - FlatDHCP Multi-host with high availability (HA)
     - Networking failure is isolated to the VMs running on the affected
       hypervisor. DHCP traffic can be isolated within an individual host.
       Network traffic is distributed to the compute nodes.
     - More complex to set up. Compute nodes typically need IP addresses
       accessible by external networks. Options must be carefully configured
       for live migration to work with networking services.
     - Configure neutron with multiple DHCP and layer-3 agents. Network nodes
       are not able to failover to each other, so the controller runs
       networking services, such as DHCP. Compute nodes run the ML2 plug-in
       with support for agents such as Open vSwitch or Linux Bridge.

Both ``nova-network`` and neutron services provide similar capabilities,
such as VLAN between VMs. You also can provide multiple NICs on VMs with
either service. Further discussion follows.

VLAN Configuration Within OpenStack VMs
---------------------------------------

VLAN configuration can be as simple or as complicated as desired. The
use of VLANs has the benefit of allowing each project its own subnet and
broadcast segregation from other projects. To allow OpenStack to
efficiently use VLANs, you must allocate a VLAN range (one for each
project) and turn each compute node switch port into a trunk
port.

For example, if you estimate that your cloud must support a maximum of
100 projects, pick a free VLAN range that your network infrastructure is
currently not using (such as VLAN 200–299). You must configure OpenStack
with this range and also configure your switch ports to allow VLAN
traffic from that range.

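A simple way to reason about the sizing is to treat the VLAN range as a fixed
pool handed out one ID per project. The rough sketch below uses the 200–299
range from the example; the project names are placeholders:

.. code-block:: python

   # Rough sketch: hand out one VLAN ID per project from a fixed pool.
   # The 200-299 range matches the example above; project names are placeholders.
   vlan_pool = list(range(200, 300))          # 100 VLANs -> at most 100 projects
   projects = ["acme", "globex", "initech"]

   assignments = dict(zip(projects, vlan_pool))
   remaining = len(vlan_pool) - len(assignments)
   print(assignments)   # {'acme': 200, 'globex': 201, 'initech': 202}
   print(remaining)     # 97 VLANs left for future projects
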
Multi-NIC Provisioning
----------------------

OpenStack Networking with ``neutron`` and OpenStack Compute with
``nova-network`` have the ability to assign multiple NICs to instances. For
``nova-network`` this can be done on a per-request basis, with each
additional NIC using up an entire subnet or VLAN, reducing the total
number of supported projects.

Multi-Host and Single-Host Networking
-------------------------------------

The ``nova-network`` service has the ability to operate in a multi-host
or single-host mode. Multi-host is when each compute node runs a copy of
``nova-network`` and the instances on that compute node use the compute
node as a gateway to the Internet. The compute nodes also host the
floating IPs and security groups for instances on that node. Single-host
is when a central server—for example, the cloud controller—runs the
``nova-network`` service. All compute nodes forward traffic from the
instances to the cloud controller. The cloud controller then forwards
traffic to the Internet. The cloud controller hosts the floating IPs and
security groups for all instances on all compute nodes in the
cloud.

There are benefits to both modes. Single-host has the downside of a
single point of failure. If the cloud controller is not available,
instances cannot communicate on the network. This is not true with
multi-host, but multi-host requires that each compute node has a public
IP address to communicate on the Internet. If you are not able to obtain
a significant block of public IP addresses, multi-host might not be an
option.

Services for Networking
~~~~~~~~~~~~~~~~~~~~~~~

OpenStack, like any network application, has a number of standard
services to consider, such as NTP and DNS.

NTP
---

Time synchronization is a critical element to ensure continued operation
of OpenStack components. Correct time is necessary to avoid errors in
instance scheduling, replication of objects in the object store, and
even matching log timestamps for debugging.

All servers running OpenStack components should be able to access an
appropriate NTP server. You may decide to set up one locally or use the
public pools available from the `Network Time Protocol
project <http://www.pool.ntp.org/>`_.

DNS
---

OpenStack does not currently provide DNS services, aside from the
dnsmasq daemon, which resides on ``nova-network`` hosts. You could
consider providing a dynamic DNS service to allow instances to update a
DNS entry with new IP addresses. You can also consider making a generic
forward and reverse DNS mapping for instances' IP addresses, such as
vm-203-0-113-123.example.com.

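For the generic mapping mentioned above, the host name can be derived
mechanically from the address. The tiny sketch below follows the
``vm-203-0-113-123.example.com`` pattern, with ``example.com`` as the
placeholder domain used in the text:

.. code-block:: python

   # Sketch: derive a generic forward DNS name from an instance IP address,
   # matching the vm-203-0-113-123.example.com pattern mentioned above.
   import ipaddress

   def instance_hostname(ip, domain="example.com"):
       address = ipaddress.ip_address(ip)          # validates the address
       return "vm-{}.{}".format(str(address).replace(".", "-"), domain)

   print(instance_hostname("203.0.113.123"))       # vm-203-0-113-123.example.com
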