diff --git a/doc/source/admin/multitenancy.rst b/doc/source/admin/multitenancy.rst
index bd207df5e5..5e95ab62e8 100644
--- a/doc/source/admin/multitenancy.rst
+++ b/doc/source/admin/multitenancy.rst
@@ -17,6 +17,8 @@ not support trunk ports belonging to multiple networks.
 Concepts
 ========
 
+.. _network-interfaces:
+
 Network interfaces
 ------------------
 
diff --git a/doc/source/admin/troubleshooting.rst b/doc/source/admin/troubleshooting.rst
index 0eab739f1c..e06959c002 100644
--- a/doc/source/admin/troubleshooting.rst
+++ b/doc/source/admin/troubleshooting.rst
@@ -328,6 +328,8 @@ distribution that uses ``systemd``:
     ip_addr
     iptables
 
+.. _troubleshooting-stp:
+
 DHCP during PXE or iPXE is inconsistent or unreliable
 =====================================================
 
diff --git a/doc/source/install/refarch/common.rst b/doc/source/install/refarch/common.rst
index 7a9d2c56fb..a1024f02cc 100644
--- a/doc/source/install/refarch/common.rst
+++ b/doc/source/install/refarch/common.rst
@@ -7,6 +7,8 @@ architectures.
 .. contents::
    :local:
 
+.. _refarch-common-components:
+
 Components
 ----------
 
@@ -46,7 +48,7 @@ components.
 
 .. warning::
     The ``ironic-python-agent`` service is not intended to be used or executed
-    anywhere other than a provisioning/cleaning ramdisk.
+    anywhere other than a provisioning/cleaning/rescue ramdisk.
 
 Hardware and drivers
 --------------------
@@ -82,6 +84,8 @@ also supports :doc:`/admin/drivers/redfish`.
 Several vendors provide more specific drivers that usually provide additional
 capabilities. Check :doc:`/admin/drivers` to find the most suitable one.
 
+.. _refarch-common-boot:
+
 Boot interface
 ~~~~~~~~~~~~~~
 
@@ -182,6 +186,8 @@ node. See :ref:`local-boot-partition-images` for details.
     Currently, network boot is used by default. However, we plan on changing it
     in the future, so it's safer to set the ``default_boot_option`` explicitly.
 
+.. _refarch-common-networking:
+
 Networking
 ----------
 
@@ -194,13 +200,20 @@ documentation. However, several considerations are common for all of them:
   not be accessible by end users, and should not have access to the internet.
 
 * There has to be a *cleaning* network, which is used by nodes during
-  the cleaning process. In the majority of cases, the same network should
-  be used for cleaning and provisioning for simplicity.
+  the cleaning process.
 
-Unless noted otherwise, everything in these sections apply to both networks.
+* There should be a *rescuing* network, which is used by nodes during
+  the rescue process. It can be skipped if the rescue process is not supported.
+
+.. note::
+    In the majority of cases, the same network should be used for cleaning,
+    provisioning and rescue for simplicity.
+
+Unless noted otherwise, everything in these sections apply to all three
+networks.
 
 * The baremetal nodes must have access to the Bare Metal API while connected
-  to the provisioning/cleaning network.
+  to the provisioning/cleaning/rescuing network.
 
   .. note::
       Only two endpoints need to be exposed there::
@@ -213,25 +226,32 @@ Unless noted otherwise, everything in these sections apply to both networks.
 
 * If the ``pxe`` boot interface (or any boot interface based on it) is used,
   then the baremetal nodes should have untagged (access mode) connectivity
-  to the provisioning/cleaning networks. It allows PXE firmware, which does not
-  support VLANs, to communicate with the services required for provisioning.
+  to the provisioning/cleaning/rescuing networks. It allows PXE firmware, which
+  does not support VLANs, to communicate with the services required
+  for provisioning.
 
   .. note::
       It depends on the *network interface* whether the Bare Metal service will
       handle it automatically. Check the networking documentation for the
       specific architecture.
 
+  Sometimes it may be necessary to disable the spanning tree protocol delay on
+  the switch - see :ref:`troubleshooting-stp`.
+
 * The Baremetal nodes need to have access to any services required for
-  provisioning/cleaning, while connected to the provisioning/cleaning network.
-  This may include:
+  provisioning/cleaning/rescue, while connected to the
+  provisioning/cleaning/rescuing network. This may include:
 
   * a TFTP server for PXE boot and also an HTTP server when iPXE is enabled
   * either an HTTP server or the Object Storage service in case of the
     ``direct`` deploy interface and some virtual media boot interfaces
 
 * The Baremetal Conductors need to have access to the booted baremetal nodes
-  during provisioning/cleaning. A conductor communicates with an internal
-  API, provided by **ironic-python-agent**, to conduct actions on nodes.
+  during provisioning/cleaning/rescue. A conductor communicates with
+  an internal API, provided by **ironic-python-agent**, to conduct actions
+  on nodes.
+
+.. _refarch-common-ha:
 
 HA and Scalability
 ------------------
diff --git a/doc/source/install/refarch/index.rst b/doc/source/install/refarch/index.rst
index 88d8d728e1..6729006b56 100644
--- a/doc/source/install/refarch/index.rst
+++ b/doc/source/install/refarch/index.rst
@@ -10,3 +10,11 @@ to get better familiar with the concepts used in this guide.
   :maxdepth: 2
 
   common
+
+Scenarios
+---------
+
+.. toctree::
+  :maxdepth: 2
+
+  small-cloud-trusted-tenants
diff --git a/doc/source/install/refarch/small-cloud-trusted-tenants.rst b/doc/source/install/refarch/small-cloud-trusted-tenants.rst
new file mode 100644
index 0000000000..a17e165831
--- /dev/null
+++ b/doc/source/install/refarch/small-cloud-trusted-tenants.rst
@@ -0,0 +1,248 @@
+Small cloud with trusted tenants
+================================
+
+Story
+-----
+
+As an operator I would like to build a small cloud with both virtual and bare
+metal instances or add bare metal provisioning to my existing small or medium
+scale single-site OpenStack cloud. The expected number of bare metal machines
+is less than 100, and the rate of provisioning and unprovisioning is expected
+to be low. All users of my cloud are trusted by me to not conduct malicious
+actions towards each other or the cloud infrastructure itself.
+
+As a user I would like to occasionally provision bare metal instances through
+the Compute API by selecting an appropriate Compute flavor. I would like
+to be able to boot them from images provided by the Image service or from
+volumes provided by the Volume service.
+
+Components
+----------
+
+This architecture assumes `an OpenStack installation`_ with the following
+components participating in the bare metal provisioning:
+
+* The `Compute service`_ manages bare metal instances.
+
+* The `Networking service`_ provides DHCP for bare metal instances.
+
+* The `Image service`_ provides images for bare metal instances.
+
+The following services can be optionally used by the Bare Metal service:
+
+* The `Volume service`_ provides volumes to boot bare metal instances from.
+
+* The `Bare Metal Introspection service`_ simplifies enrolling new bare metal
+  machines by conducting in-band introspection.
+
+Node roles
+----------
+
+An OpenStack installation in this guide has at least these three types of
+nodes:
+
+* A *controller* node hosts the control plane services.
+
+* A *compute* node runs the virtual machines and hosts a subset of Compute
+  and Networking components.
+
+* A *block storage* node provides persistent storage space for both virtual
+  and bare metal nodes.
+
+The *compute* and *block storage* nodes are configured as described in the
+installation guides of the `Compute service`_ and the `Volume service`_
+respectively. The *controller* nodes host the Bare Metal service components.
+
+Networking
+----------
+
+The networking architecture will highly depend on the exact operating
+requirements. This guide expects the following existing networks:
+*control plane*, *storage* and *public*. Additionally, two more networks
+will be needed specifically for bare metal provisioning: *bare metal* and
+*management*.
+
+.. TODO(dtantsur): describe the storage network?
+
+.. TODO(dtantsur): a nice picture to illustrate the layout
+
+Control plane network
+~~~~~~~~~~~~~~~~~~~~~
+
+The *control plane network* is the network where OpenStack control plane
+services provide their public API.
+
+The Bare Metal API will be served to the operators and to the Compute service
+through this network.
+
+Public network
+~~~~~~~~~~~~~~
+
+The *public network* is used in a typical OpenStack deployment to create
+floating IPs for outside access to instances. Its role is the same for a bare
+metal deployment.
+
+.. note::
+    Since, as explained below, bare metal nodes will be put on a flat provider
+    network, it is also possible to organize direct access to them, without
+    using floating IPs and bypassing the Networking service completely.
+
+Bare metal network
+~~~~~~~~~~~~~~~~~~
+
+The *Bare metal network* is a dedicated network for bare metal nodes managed by
+the Bare Metal service.
+
+This architecture uses :ref:`flat bare metal networking `,
+in which both tenant traffic and technical traffic related to the Bare Metal
+service operation flow through this one network. Specifically, this network
+will serve as the *provisioning*, *cleaning* and *rescuing* network. It will
+also be used for introspection via the Bare Metal Introspection service.
+See :ref:`common networking considerations ` for
+an in-depth explanation of the networks used by the Bare Metal service.
+
+DHCP and boot parameters will be provided on this network by the Networking
+service's DHCP agents.
+
+For booting from volumes this network has to have a route to
+the *storage network*.
+
+Management network
+~~~~~~~~~~~~~~~~~~
+
+*Management network* is an independent network on which BMCs of the bare
+metal nodes are located.
+
+The ``ironic-conductor`` process needs access to this network. The tenants
+of the bare metal nodes must not have access to it.
+
+.. note::
+    The :ref:`direct deploy interface ` and certain
+    :doc:`/admin/drivers` require the *management network* to have access
+    to the Object storage service backend.
+
+Controllers
+-----------
+
+A *controller* hosts the OpenStack control plane services as described in the
+`control plane design guide`_. While this architecture allows using
+*controllers* in a non-HA configuration, it is recommended to have at least
+three of them for HA. See :ref:`refarch-common-ha` for more details.
+
+Bare Metal services
+~~~~~~~~~~~~~~~~~~~
+
+The following components of the Bare Metal service are installed on a
+*controller* (see :ref:`components of the Bare Metal service
+`):
+
+* The Bare Metal API service either as a WSGI application or the ``ironic-api``
+  process. Typically, a load balancer, such as HAProxy, spreads the load
+  between the API instances on the *controllers*.
+
+  The API has to be served on the *control plane network*. Additionally,
+  it has to be exposed to the *bare metal network* for the ramdisk callback
+  API.
+
+* The ``ironic-conductor`` process. These processes work in active/active HA
+  mode as explained in :ref:`refarch-common-ha`, thus they can be installed on
+  all *controllers*. Each will handle a subset of bare metal nodes.
+
+  The ``ironic-conductor`` processes have to have access to the following
+  networks:
+
+  * *control plane* for interacting with other services
+  * *management* for contacting node's BMCs
+  * *bare metal* for contacting deployment, cleaning or rescue ramdisks
+
+* TFTP and HTTP service for booting the nodes. Each ``ironic-conductor``
+  process has to have a matching TFTP and HTTP service. They should be exposed
+  only to the *bare metal network* and must not be behind a load balancer.
+
+* The ``nova-compute`` process (from the Compute service). These processes work
+  in active/active HA mode when dealing with bare metal nodes, thus they can be
+  installed on all *controllers*. Each will handle a subset of bare metal
+  nodes.
+
+  .. note::
+      There is no 1-1 mapping between ``ironic-conductor`` and ``nova-compute``
+      processes, as they communicate only through the Bare Metal API service.
+
+* The networking-baremetal_ ML2 plugin should be loaded into the Networking
+  service to assist with binding bare metal ports.
+
+  The ironic-neutron-agent_ service should be started as well.
+
+* If the Bare Metal introspection is used, its ``ironic-inspector`` process
+  has to be installed on all *controllers*. Each such process works as both
+  Bare Metal Introspection API and conductor service. A load balancer should
+  be used to spread the API load between *controllers*.
+
+  The API has to be served on the *control plane network*. Additionally,
+  it has to be exposed to the *bare metal network* for the ramdisk callback
+  API.
+
+.. TODO(dtantsur): a nice picture to illustrate the above
+
+Shared services
+~~~~~~~~~~~~~~~
+
+A *controller* also hosts two services required for the normal operation
+of OpenStack:
+
+* Database service (MySQL/MariaDB is typically used, but other
+  enterprise-grade database solutions can be used as well).
+
+  All Bare Metal service components need access to the database service.
+
+* Message queue service (RabbitMQ is typically used, but other
+  enterprise-grade message queue brokers can be used as well).
+
+  Both Bare Metal API (WSGI application or ``ironic-api`` process) and
+  the ``ironic-conductor`` processes need access to the message queue service.
+  The Bare Metal Introspection service does not need it.
+
+.. note::
+    These services are required for all OpenStack services. If you're adding
+    the Bare Metal service to your cloud, you may reuse the existing
+    database and messaging queue services.
+
+Bare metal nodes
+----------------
+
+Each bare metal node must be capable of booting from network, virtual media
+or other boot technology supported by the Bare Metal service as explained
+in :ref:`refarch-common-boot`. Each node must have one NIC on the *bare metal
+network*, and this NIC (and **only** it) must be configured to be able to boot
+from network. This is usually done in the *BIOS setup* or a similar firmware
+configuration utility. There is no need to alter the boot order, as it is
+managed by the Bare Metal service. Other NICs, if present, will not be managed
+by OpenStack.
+
+The NIC on the *bare metal network* should have untagged connectivity to it,
+since PXE firmware usually does not support VLANs - see
+:ref:`refarch-common-networking` for details.
+
+Storage
+-------
+
+If your hardware **and** its bare metal :doc:`driver ` support
+booting from remote volumes, please check the driver documentation for
+information on how to enable it. It may include routing *management* and/or
+*bare metal* networks to the *storage network*.
+
+In case of the standard :ref:`pxe-boot`, booting from remote volumes is done
+via iPXE. In that case, the Volume storage backend must support iSCSI_
+protocol, and the *bare metal network* has to have a route to the *storage
+network*. See :doc:`/admin/boot-from-volume` for more details.
+
+.. _an OpenStack installation: https://docs.openstack.org/arch-design/use-cases/use-case-general-compute.html
+.. _Compute service: https://docs.openstack.org/nova/latest/
+.. _Networking service: https://docs.openstack.org/neutron/latest/
+.. _Image service: https://docs.openstack.org/glance/latest/
+.. _Volume service: https://docs.openstack.org/cinder/latest/
+.. _Bare Metal Introspection service: https://docs.openstack.org/ironic-inspector/latest/
+.. _control plane design guide: https://docs.openstack.org/arch-design/design-control-plane.html
+.. _networking-baremetal: https://docs.openstack.org/networking-baremetal/latest/
+.. _ironic-neutron-agent: https://docs.openstack.org/networking-baremetal/latest/install/index.html#configure-ironic-neutron-agent
+.. _iSCSI: https://en.wikipedia.org/wiki/ISCSI