Reference architecture: small cloud with trusted tenants
This document describes the design of a small bare metal cloud with flat
networking and conductors running as part of the (HA) controller nodes. It
also adds missing information about the rescuing network to the common
reference architecture guide.

Change-Id: Ifd3bfcc89263cd9810cd5cfb459ffeeaf146caf7
Story: 2001745
Task: 12108
commit 1ffa7571d3 (parent cf0b64c4c5)
@@ -17,6 +17,8 @@ not support trunk ports belonging to multiple networks.
 Concepts
 ========
 
+.. _network-interfaces:
+
 Network interfaces
 ------------------
@@ -328,6 +328,8 @@ distribution that uses ``systemd``:
 ip_addr
 iptables
 
+.. _troubleshooting-stp:
+
 DHCP during PXE or iPXE is inconsistent or unreliable
 =====================================================
@@ -7,6 +7,8 @@ architectures.
 .. contents::
    :local:
 
+.. _refarch-common-components:
+
 Components
 ----------
@@ -46,7 +48,7 @@ components.
 
 .. warning::
    The ``ironic-python-agent`` service is not intended to be used or executed
-   anywhere other than a provisioning/cleaning ramdisk.
+   anywhere other than a provisioning/cleaning/rescue ramdisk.
 
 Hardware and drivers
 --------------------
@@ -82,6 +84,8 @@ also supports :doc:`/admin/drivers/redfish`. Several vendors
 provide more specific drivers that usually provide additional capabilities.
 Check :doc:`/admin/drivers` to find the most suitable one.
 
+.. _refarch-common-boot:
+
 Boot interface
 ~~~~~~~~~~~~~~
@@ -182,6 +186,8 @@ node. See :ref:`local-boot-partition-images` for details.
 Currently, network boot is used by default. However, we plan on changing it
 in the future, so it's safer to set the ``default_boot_option`` explicitly.
 
+.. _refarch-common-networking:
+
 Networking
 ----------
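For illustration, a minimal sketch of pinning the ``default_boot_option``
mentioned above in ``ironic.conf`` (assuming local boot is the desired
default; adjust to your deployment):

.. code-block:: ini

   [deploy]
   # Set explicitly, so a future change of the built-in default
   # does not silently alter boot behaviour.
   default_boot_option = local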
@@ -194,13 +200,20 @@ documentation. However, several considerations are common for all of them:
   not be accessible by end users, and should not have access to the internet.
 
 * There has to be a *cleaning* network, which is used by nodes during
-  the cleaning process. In the majority of cases, the same network should
-  be used for cleaning and provisioning for simplicity.
+  the cleaning process.
 
-Unless noted otherwise, everything in these sections apply to both networks.
+* There should be a *rescuing* network, which is used by nodes during
+  the rescue process. It can be skipped if the rescue process is not supported.
+
+.. note::
+   In the majority of cases, the same network should be used for cleaning,
+   provisioning and rescue for simplicity.
+
+Unless noted otherwise, everything in these sections applies to all three
+networks.
 
 * The baremetal nodes must have access to the Bare Metal API while connected
-  to the provisioning/cleaning network.
+  to the provisioning/cleaning/rescuing network.
 
 .. note::
    Only two endpoints need to be exposed there::
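For illustration, the networks discussed above are configured in the
``[neutron]`` section of ``ironic.conf``; a minimal sketch, assuming a single
network named ``baremetal`` serves all three roles, per the note above:

.. code-block:: ini

   [neutron]
   # Name or UUID of the Networking service network used for each phase.
   # Here one network is assumed to serve provisioning, cleaning and rescue.
   provisioning_network = baremetal
   cleaning_network = baremetal
   rescuing_network = baremetal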
@@ -213,25 +226,32 @@ Unless noted otherwise, everything in these sections apply to both networks.
 
 * If the ``pxe`` boot interface (or any boot interface based on it) is used,
   then the baremetal nodes should have untagged (access mode) connectivity
-  to the provisioning/cleaning networks. It allows PXE firmware, which does not
-  support VLANs, to communicate with the services required for provisioning.
+  to the provisioning/cleaning/rescuing networks. It allows PXE firmware, which
+  does not support VLANs, to communicate with the services required
+  for provisioning.
 
   .. note::
      It depends on the *network interface* whether the Bare Metal service will
      handle it automatically. Check the networking documentation for the
     specific architecture.
 
+  Sometimes it may be necessary to disable the spanning tree protocol delay on
+  the switch - see :ref:`troubleshooting-stp`.
+
 * The Baremetal nodes need to have access to any services required for
-  provisioning/cleaning, while connected to the provisioning/cleaning network.
-  This may include:
+  provisioning/cleaning/rescue, while connected to the
+  provisioning/cleaning/rescuing network. This may include:
 
   * a TFTP server for PXE boot and also an HTTP server when iPXE is enabled
   * either an HTTP server or the Object Storage service in case of the
     ``direct`` deploy interface and some virtual media boot interfaces
 
 * The Baremetal Conductors need to have access to the booted baremetal nodes
-  during provisioning/cleaning. A conductor communicates with an internal
-  API, provided by **ironic-python-agent**, to conduct actions on nodes.
+  during provisioning/cleaning/rescue. A conductor communicates with
+  an internal API, provided by **ironic-python-agent**, to conduct actions
+  on nodes.
+
+.. _refarch-common-ha:
 
 HA and Scalability
 ------------------
@@ -10,3 +10,11 @@ to get better familiar with the concepts used in this guide.
    :maxdepth: 2
 
    common
+
+Scenarios
+---------
+
+.. toctree::
+   :maxdepth: 2
+
+   small-cloud-trusted-tenants
doc/source/install/refarch/small-cloud-trusted-tenants.rst (new file, 248 lines)
@@ -0,0 +1,248 @@
Small cloud with trusted tenants
================================

Story
-----

As an operator I would like to build a small cloud with both virtual and bare
metal instances or add bare metal provisioning to my existing small or medium
scale single-site OpenStack cloud. The expected number of bare metal machines
is less than 100, and the rate of provisioning and unprovisioning is expected
to be low. All users of my cloud are trusted by me to not conduct malicious
actions towards each other or the cloud infrastructure itself.

As a user I would like to occasionally provision bare metal instances through
the Compute API by selecting an appropriate Compute flavor. I would like
to be able to boot them from images provided by the Image service or from
volumes provided by the Volume service.
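As a sketch of what "an appropriate Compute flavor" can look like for this
story (the flavor name, sizes and the ``CUSTOM_GOLD`` resource class are
assumptions, not values mandated by this guide):

.. code-block:: console

   $ openstack flavor create --ram 32768 --vcpus 8 --disk 100 bm.gold
   $ openstack flavor set bm.gold \
       --property resources:CUSTOM_GOLD=1 \
       --property resources:VCPU=0 \
       --property resources:MEMORY_MB=0 \
       --property resources:DISK_GB=0

Such a flavor requests exactly one bare metal node of the matching resource
class and zeroes out the standard virtual resources, so the scheduler picks
whole nodes rather than slices of them.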
Components
----------

This architecture assumes `an OpenStack installation`_ with the following
components participating in the bare metal provisioning:

* The `Compute service`_ manages bare metal instances.

* The `Networking service`_ provides DHCP for bare metal instances.

* The `Image service`_ provides images for bare metal instances.

The following services can be optionally used by the Bare Metal service:

* The `Volume service`_ provides volumes to boot bare metal instances from.

* The `Bare Metal Introspection service`_ simplifies enrolling new bare metal
  machines by conducting in-band introspection.

Node roles
----------

An OpenStack installation in this guide has at least these three types of
nodes:

* A *controller* node hosts the control plane services.

* A *compute* node runs the virtual machines and hosts a subset of Compute
  and Networking components.

* A *block storage* node provides persistent storage space for both virtual
  and bare metal nodes.

The *compute* and *block storage* nodes are configured as described in the
installation guides of the `Compute service`_ and the `Volume service`_
respectively. The *controller* nodes host the Bare Metal service components.
Networking
----------

The networking architecture will highly depend on the exact operating
requirements. This guide expects the following existing networks:
*control plane*, *storage* and *public*. Additionally, two more networks
will be needed specifically for bare metal provisioning: *bare metal* and
*management*.

.. TODO(dtantsur): describe the storage network?

.. TODO(dtantsur): a nice picture to illustrate the layout

Control plane network
~~~~~~~~~~~~~~~~~~~~~

The *control plane network* is the network where OpenStack control plane
services provide their public API.

The Bare Metal API will be served to the operators and to the Compute service
through this network.

Public network
~~~~~~~~~~~~~~

The *public network* is used in a typical OpenStack deployment to create
floating IPs for outside access to instances. Its role is the same for a bare
metal deployment.

.. note::
   Since, as explained below, bare metal nodes will be put on a flat provider
   network, it is also possible to organize direct access to them, without
   using floating IPs and bypassing the Networking service completely.

Bare metal network
~~~~~~~~~~~~~~~~~~

The *bare metal network* is a dedicated network for bare metal nodes managed
by the Bare Metal service.

This architecture uses :ref:`flat bare metal networking <network-interfaces>`,
in which both tenant traffic and technical traffic related to the Bare Metal
service operation flow through this one network. Specifically, this network
will serve as the *provisioning*, *cleaning* and *rescuing* network. It will
also be used for introspection via the Bare Metal Introspection service.
See :ref:`common networking considerations <refarch-common-networking>` for
an in-depth explanation of the networks used by the Bare Metal service.

DHCP and boot parameters will be provided on this network by the Networking
service's DHCP agents.

For booting from volumes this network has to have a route to
the *storage network*.
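As a sketch, creating such a flat provider network with DHCP could look like
this (the ``physnet1`` physical network, the address range and all names are
assumptions):

.. code-block:: console

   $ openstack network create --share --provider-network-type flat \
       --provider-physical-network physnet1 baremetal
   $ openstack subnet create --network baremetal \
       --subnet-range 192.0.2.0/24 --dhcp baremetal-subnet

On the Bare Metal side, flat networking would be enabled in ``ironic.conf``:

.. code-block:: ini

   [DEFAULT]
   enabled_network_interfaces = flat
   default_network_interface = flat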
Management network
~~~~~~~~~~~~~~~~~~

The *management network* is an independent network on which BMCs of the bare
metal nodes are located.

The ``ironic-conductor`` process needs access to this network. The tenants
of the bare metal nodes must not have access to it.

.. note::
   The :ref:`direct deploy interface <direct-deploy>` and certain
   :doc:`/admin/drivers` require the *management network* to have access
   to the Object Storage service backend.

Controllers
-----------

A *controller* hosts the OpenStack control plane services as described in the
`control plane design guide`_. While this architecture allows using
*controllers* in a non-HA configuration, it is recommended to have at least
three of them for HA. See :ref:`refarch-common-ha` for more details.
Bare Metal services
~~~~~~~~~~~~~~~~~~~

The following components of the Bare Metal service are installed on a
*controller* (see :ref:`components of the Bare Metal service
<refarch-common-components>`):

* The Bare Metal API service either as a WSGI application or the ``ironic-api``
  process. Typically, a load balancer, such as HAProxy, spreads the load
  between the API instances on the *controllers*.

  The API has to be served on the *control plane network*. Additionally,
  it has to be exposed to the *bare metal network* for the ramdisk callback
  API.
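A minimal HAProxy sketch for spreading the API load (all addresses are
illustrative ``192.0.2.0/24`` documentation addresses, not prescribed values):

.. code-block:: none

   frontend ironic_api
       bind 192.0.2.10:6385
       mode http
       default_backend ironic_api_servers

   backend ironic_api_servers
       mode http
       balance roundrobin
       server controller1 192.0.2.11:6385 check
       server controller2 192.0.2.12:6385 check
       server controller3 192.0.2.13:6385 check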
* The ``ironic-conductor`` process. These processes work in active/active HA
  mode as explained in :ref:`refarch-common-ha`, thus they can be installed on
  all *controllers*. Each will handle a subset of bare metal nodes.

  The ``ironic-conductor`` processes have to have access to the following
  networks:

  * *control plane* for interacting with other services
  * *management* for contacting the nodes' BMCs
  * *bare metal* for contacting deployment, cleaning or rescue ramdisks

* TFTP and HTTP service for booting the nodes. Each ``ironic-conductor``
  process has to have a matching TFTP and HTTP service. They should be exposed
  only to the *bare metal network* and must not be behind a load balancer.
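A sketch of the per-conductor ``ironic.conf`` options tying a conductor to its
co-located TFTP and HTTP services (the address is an assumed *bare metal
network* IP of that controller):

.. code-block:: ini

   [pxe]
   # This conductor's TFTP server on the bare metal network.
   tftp_server = 192.0.2.11
   tftp_root = /tftpboot

   [deploy]
   # HTTP server co-located with this conductor (used with iPXE).
   http_url = http://192.0.2.11:8080
   http_root = /httpboot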
* The ``nova-compute`` process (from the Compute service). These processes work
  in active/active HA mode when dealing with bare metal nodes, thus they can be
  installed on all *controllers*. Each will handle a subset of bare metal
  nodes.

  .. note::
     There is no 1-1 mapping between ``ironic-conductor`` and ``nova-compute``
     processes, as they communicate only through the Bare Metal API service.
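The corresponding ``nova.conf`` on each *controller* points the Compute
service at the Bare Metal API, sketched here with assumed credentials and an
assumed Identity service endpoint:

.. code-block:: ini

   [DEFAULT]
   compute_driver = ironic.IronicDriver

   [ironic]
   # Illustrative Identity service credentials for the ironic user.
   auth_type = password
   auth_url = http://192.0.2.10:5000/v3
   project_name = service
   username = ironic
   password = IRONIC_PASSWORD
   project_domain_name = Default
   user_domain_name = Default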
* The networking-baremetal_ ML2 plugin should be loaded into the Networking
  service to assist with binding bare metal ports.

  The ironic-neutron-agent_ service should be started as well.
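Loading the plugin is a one-line change to the ML2 configuration
(``openvswitch`` below stands in for whatever mechanism drivers the cloud
already uses):

.. code-block:: ini

   [ml2]
   mechanism_drivers = openvswitch,baremetal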
* If the Bare Metal Introspection service is used, its ``ironic-inspector``
  process has to be installed on all *controllers*. Each such process works as
  both the Bare Metal Introspection API and conductor service. A load balancer
  should be used to spread the API load between *controllers*.

  The API has to be served on the *control plane network*. Additionally,
  it has to be exposed to the *bare metal network* for the ramdisk callback
  API.

.. TODO(dtantsur): a nice picture to illustrate the above
Shared services
~~~~~~~~~~~~~~~

A *controller* also hosts two services required for the normal operation
of OpenStack:

* Database service (MySQL/MariaDB is typically used, but other
  enterprise-grade database solutions can be used as well).

  All Bare Metal service components need access to the database service.

* Message queue service (RabbitMQ is typically used, but other
  enterprise-grade message queue brokers can be used as well).

  Both the Bare Metal API (WSGI application or ``ironic-api`` process) and
  the ``ironic-conductor`` processes need access to the message queue service.
  The Bare Metal Introspection service does not need it.

.. note::
   These services are required for all OpenStack services. If you're adding
   the Bare Metal service to your cloud, you may reuse the existing database
   and message queue services.
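In ``ironic.conf`` these shared services are consumed through two standard
options, sketched here with placeholder credentials and an assumed
``192.0.2.10`` virtual IP:

.. code-block:: ini

   [database]
   connection = mysql+pymysql://ironic:IRONIC_DBPASS@192.0.2.10/ironic

   [DEFAULT]
   transport_url = rabbit://openstack:RABBIT_PASS@192.0.2.10:5672/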
Bare metal nodes
----------------

Each bare metal node must be capable of booting from network, virtual media
or another boot technology supported by the Bare Metal service as explained
in :ref:`refarch-common-boot`. Each node must have one NIC on the *bare metal
network*, and this NIC (and **only** it) must be configured to be able to boot
from network. This is usually done in the *BIOS setup* or a similar firmware
configuration utility. There is no need to alter the boot order, as it is
managed by the Bare Metal service. Other NICs, if present, will not be managed
by OpenStack.

The NIC on the *bare metal network* should have untagged connectivity to it,
since PXE firmware usually does not support VLANs - see
:ref:`refarch-common-networking` for details.
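Accordingly, when enrolling a node, only the MAC address of that one NIC would
be registered as a port, for example (the MAC and node UUID are placeholders):

.. code-block:: console

   $ openstack baremetal port create 52:54:00:cf:2d:31 \
       --node 22af39e7-56d4-4b1f-a81a-3e2a43cf2dbe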
Storage
-------

If your hardware **and** its bare metal :doc:`driver </admin/drivers>` support
booting from remote volumes, please check the driver documentation for
information on how to enable it. It may include routing the *management*
and/or *bare metal* networks to the *storage network*.

In case of the standard :ref:`pxe-boot`, booting from remote volumes is done
via iPXE. In that case, the Volume storage backend must support the iSCSI_
protocol, and the *bare metal network* has to have a route to the *storage
network*. See :doc:`/admin/boot-from-volume` for more details.
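If boot from volume is enabled, the ``cinder`` storage interface also has to
be made available to nodes in ``ironic.conf``, sketched here under that
assumption:

.. code-block:: ini

   [DEFAULT]
   enabled_storage_interfaces = cinder,noop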
.. _an OpenStack installation: https://docs.openstack.org/arch-design/use-cases/use-case-general-compute.html
.. _Compute service: https://docs.openstack.org/nova/latest/
.. _Networking service: https://docs.openstack.org/neutron/latest/
.. _Image service: https://docs.openstack.org/glance/latest/
.. _Volume service: https://docs.openstack.org/cinder/latest/
.. _Bare Metal Introspection service: https://docs.openstack.org/ironic-inspector/latest/
.. _control plane design guide: https://docs.openstack.org/arch-design/design-control-plane.html
.. _networking-baremetal: https://docs.openstack.org/networking-baremetal/latest/
.. _ironic-neutron-agent: https://docs.openstack.org/networking-baremetal/latest/install/index.html#configure-ironic-neutron-agent
.. _iSCSI: https://en.wikipedia.org/wiki/ISCSI