FUP Networking guide: Guaranteed Minimum Bandwidth
Being so close to the rc1 deadline I didn't want to lose +2s of https://review.openstack.org/640390, so review comments are fixed here. Change-Id: Ic3353dd34bbc9fe6c1e03395d095067238fac844 Related-Bug: #1578989 Related-Change: https://review.openstack.org/640390 See-Also: https://review.openstack.org/502306 (nova spec) See-Also: https://review.openstack.org/508149 (neutron spec)
This commit is contained in:
parent
252598b088
commit
7cccaf0616
@ -1,5 +1,5 @@
|
|||||||
Quality of Service (QoS): Guaranteed Bandwidth
|
Quality of Service (QoS): Guaranteed Minimum Bandwidth
|
||||||
==============================================
|
======================================================
|
||||||
|
|
||||||
Most Networking Quality of Service (QoS) features are implemented solely
|
Most Networking Quality of Service (QoS) features are implemented solely
|
||||||
by OpenStack Neutron and they are already documented in the :doc:`QoS
|
by OpenStack Neutron and they are already documented in the :doc:`QoS
|
||||||
@ -17,8 +17,8 @@ use the Guaranteed Minimum Bandwidth feature.
|
|||||||
A guarantee of minimum available bandwidth can be enforced on two levels:
|
A guarantee of minimum available bandwidth can be enforced on two levels:
|
||||||
|
|
||||||
* Scheduling a server on a compute host where the bandwidth is available.
|
* Scheduling a server on a compute host where the bandwidth is available.
|
||||||
To be more precise: scheduling a server's port on a compute host's
|
To be more precise: scheduling one or more ports of a server on a compute
|
||||||
physical network interface where the bandwidth is available.
|
host's physical network interfaces where the bandwidth is available.
|
||||||
* Queueing network packets on a physical network interface to provide the
|
* Queueing network packets on a physical network interface to provide the
|
||||||
guaranteed bandwidth.
|
guaranteed bandwidth.
|
||||||
|
|
||||||
@ -34,14 +34,14 @@ here we only document the placement-level enforcement.
|
|||||||
Limitations
|
Limitations
|
||||||
-----------
|
-----------
|
||||||
|
|
||||||
* A port with a ``minimum-bandwidth`` rule must be passed when booting a
|
* A pre-created port with a ``minimum-bandwidth`` rule must be passed
|
||||||
server (``openstack server create``). Passing a network with a
|
when booting a server (``openstack server create``). Passing a network
|
||||||
minimum-bandwidth rule at boot is rejected because of technical
|
with a minimum-bandwidth rule at boot is not supported because of
|
||||||
reasons (in this case the port is created too late for Neutron to
|
technical reasons (in this case the port is created too late for
|
||||||
affect scheduling).
|
Neutron to affect scheduling).
|
||||||
|
|
||||||
* Bandwidth guarantees for ports can only be requested on networks
|
* Bandwidth guarantees for ports can only be requested on networks
|
||||||
backed by physnets.
|
backed by a physical network (physnet).
|
||||||
|
|
||||||
* In Stein there is no support for networks with multiple physnets.
|
* In Stein there is no support for networks with multiple physnets.
|
||||||
However some simpler multi-segment networks are still supported:
|
However some simpler multi-segment networks are still supported:
|
||||||
@ -87,15 +87,15 @@ Limitations
|
|||||||
<https://developer.openstack.org/api-guide/compute/port_with_resource_request.html>`_
|
<https://developer.openstack.org/api-guide/compute/port_with_resource_request.html>`_
|
||||||
of the OpenStack Compute API Guide.
|
of the OpenStack Compute API Guide.
|
||||||
|
|
||||||
* If an SR-IOV physical function is configured for use by neutron-ovs-agent,
|
* If an SR-IOV physical function is configured for use by the
|
||||||
and the same physical function's virtual functions are configured
|
neutron-openvswitch-agent, and the same physical function's virtual
|
||||||
for use by the neutron-sriov-agent then the available bandwidth must
|
functions are configured for use by the neutron-sriov-agent then the
|
||||||
be statically split between the corresponding resource providers by
|
available bandwidth must be statically split between the corresponding
|
||||||
administrative choice. For example a 10g SR-IOV capable physical NIC
|
resource providers by administrative choice. For example a 10 Gbps
|
||||||
could be treated as two independent NICs - a 5g NIC (technically the
|
SR-IOV capable physical NIC could be treated as two independent NICs -
|
||||||
physical function of the NIC) added to an Open vSwitch bridge, and
|
a 5 Gbps NIC (technically the physical function of the NIC) added to
|
||||||
another 5g NIC whose virtual functions can be handed out to servers
|
an Open vSwitch bridge, and another 5 Gbps NIC whose virtual functions
|
||||||
by neutron-sriov-agent.
|
can be handed out to servers by neutron-sriov-agent.
|
||||||
|
|
||||||
Placement pre-requisites
|
Placement pre-requisites
|
||||||
------------------------
|
------------------------
|
||||||
@ -144,27 +144,48 @@ provider information from neutron-server to Placement.
|
|||||||
Since neutron-server talks to Placement you need to configure how
|
Since neutron-server talks to Placement you need to configure how
|
||||||
neutron-server should find Placement and authenticate to it.
|
neutron-server should find Placement and authenticate to it.
|
||||||
|
|
||||||
``neutron.conf``:
|
``/etc/neutron/neutron.conf`` (on controller nodes):
|
||||||
|
|
||||||
.. code-block:: ini
|
.. code-block:: ini
|
||||||
|
|
||||||
[DEFAULT]
|
[DEFAULT]
|
||||||
service_plugins = placement,...
|
service_plugins = placement,...
|
||||||
|
auth_strategy = keystone
|
||||||
|
|
||||||
[placement]
|
[placement]
|
||||||
auth_type = password
|
auth_type = password
|
||||||
auth_url = https://keystone.address/identity
|
auth_url = https://controller/identity
|
||||||
password = secret
|
password = secret
|
||||||
project_domain_name = Default
|
project_domain_name = Default
|
||||||
project_name = service
|
project_name = service
|
||||||
user_domain_name = Default
|
user_domain_name = Default
|
||||||
username = placement
|
username = placement
|
||||||
|
|
||||||
neutron-ovs-agent config
|
If a vnic_type is supported by default by multiple ML2 mechanism
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
drivers (e.g. ``vnic_type=direct`` by both ``openvswitch`` and
|
||||||
|
``sriovnicswitch``) and multiple agents' resources are also meant to be
|
||||||
|
tracked by Placement, then the admin must decide which driver to take
|
||||||
|
ports of that vnic_type by blacklisting the vnic_type for the unwanted
|
||||||
|
drivers. Use :oslo.config:option:`ovs_driver.vnic_type_blacklist` in this
|
||||||
|
case. Valid values are all the ``supported_vnic_types`` of the
|
||||||
|
`respective mechanism drivers
|
||||||
|
<https://docs.openstack.org/neutron/latest/admin/config-ml2.html#supported-vnic-types>`_.
|
||||||
|
|
||||||
Set agent configuration as the authentic source of the resources
|
``/etc/neutron/plugins/ml2/ml2_conf.ini`` (on controller nodes):
|
||||||
available. Set it on a per-bridge basis by
|
|
||||||
|
.. code-block:: ini
|
||||||
|
|
||||||
|
[ovs_driver]
|
||||||
|
vnic_type_blacklist = direct
|
||||||
|
|
||||||
|
[sriov_driver]
|
||||||
|
#vnic_type_blacklist = direct
|
||||||
|
|
||||||
|
neutron-openvswitch-agent config
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Set the agent configuration as the authentic source of
|
||||||
|
the resources available. Set it on a per-bridge basis by
|
||||||
:oslo.config:option:`ovs.resource_provider_bandwidths`.
|
:oslo.config:option:`ovs.resource_provider_bandwidths`.
|
||||||
The format is: ``bridge:egress:ingress,...``
|
The format is: ``bridge:egress:ingress,...``
|
||||||
You may set only one direction and omit the other.
|
You may set only one direction and omit the other.
|
||||||
@ -177,23 +198,13 @@ You may set only one direction and omit the other.
|
|||||||
Egress and ingress available bandwidth values are in ``kilobit/sec (kbps)``.
|
Egress and ingress available bandwidth values are in ``kilobit/sec (kbps)``.
|
||||||
|
|
||||||
If desired, resource provider inventory fields can be tweaked on a
|
If desired, resource provider inventory fields can be tweaked on a
|
||||||
per agent basis by setting
|
per-agent basis by setting
|
||||||
:oslo.config:option:`ovs.resource_provider_inventory_defaults`.
|
:oslo.config:option:`ovs.resource_provider_inventory_defaults`.
|
||||||
Valid values are all the
|
Valid values are all the
|
||||||
`optional parameters of the update resource provider inventory call
|
`optional parameters of the update resource provider inventory call
|
||||||
<https://developer.openstack.org/api-ref/placement/?expanded=update-resource-provider-inventory-detail#update-resource-provider-inventory>`_.
|
<https://developer.openstack.org/api-ref/placement/?expanded=update-resource-provider-inventory-detail#update-resource-provider-inventory>`_.
|
||||||
|
|
||||||
If a vnic_type is supported by default by multiple ML2 mechanism
|
``/etc/neutron/plugins/ml2/ovs_agent.ini`` (on compute and network nodes):
|
||||||
drivers (e.g. ``vnic_type=direct`` by both ``openvswitch`` and
|
|
||||||
``sriovnicswitch``) and multiple agents' resources are also meant to be
|
|
||||||
tracked by Placement, then the admin must decide which driver to take
|
|
||||||
ports of that vnic_type by blacklisting the vnic_type for the unwanted
|
|
||||||
drivers. Use :oslo.config:option:`ovs_driver.vnic_type_blacklist` in this
|
|
||||||
case. Valid values are all the ``supported_vnic_types`` of the
|
|
||||||
`respective mechanism drivers
|
|
||||||
<https://docs.openstack.org/neutron/latest/admin/config-ml2.html#supported-vnic-types>`_.
|
|
||||||
|
|
||||||
``ovs_agent.ini`` or ``ml2_conf.ini``:
|
|
||||||
|
|
||||||
.. code-block:: ini
|
.. code-block:: ini
|
||||||
|
|
||||||
@ -202,14 +213,11 @@ case. Valid values are all the ``supported_vnic_types`` of the
|
|||||||
resource_provider_bandwidths = br-physnet0:10000000:10000000,...
|
resource_provider_bandwidths = br-physnet0:10000000:10000000,...
|
||||||
#resource_provider_inventory_defaults = step_size:1000,...
|
#resource_provider_inventory_defaults = step_size:1000,...
|
||||||
|
|
||||||
[ovs_driver]
|
|
||||||
vnic_type_blacklist = direct
|
|
||||||
|
|
||||||
neutron-sriov-agent config
|
neutron-sriov-agent config
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The configuration of neutron-sriov-agent is analog to that of
|
The configuration of neutron-sriov-agent is analog to that of
|
||||||
neutron-ovs-agent. However look out for:
|
neutron-openvswitch-agent. However look out for:
|
||||||
|
|
||||||
* The different .ini section names as you can see below.
|
* The different .ini section names as you can see below.
|
||||||
* That neutron-sriov-agent allows a physnet to be backed by multiple physical
|
* That neutron-sriov-agent allows a physnet to be backed by multiple physical
|
||||||
@ -217,7 +225,7 @@ neutron-ovs-agent. However look out for:
|
|||||||
* Of course refer to SR-IOV physical functions instead of bridges in
|
* Of course refer to SR-IOV physical functions instead of bridges in
|
||||||
:oslo.config:option:`sriov_nic.resource_provider_bandwidths`.
|
:oslo.config:option:`sriov_nic.resource_provider_bandwidths`.
|
||||||
|
|
||||||
``sriov_agent.ini``:
|
``/etc/neutron/plugins/ml2/sriov_agent.ini`` (on compute nodes):
|
||||||
|
|
||||||
.. code-block:: ini
|
.. code-block:: ini
|
||||||
|
|
||||||
@ -226,9 +234,6 @@ neutron-ovs-agent. However look out for:
|
|||||||
resource_provider_bandwidths = ens5:40000000:40000000,ens6:40000000:40000000,...
|
resource_provider_bandwidths = ens5:40000000:40000000,ens6:40000000:40000000,...
|
||||||
#resource_provider_inventory_defaults = step_size:1000,...
|
#resource_provider_inventory_defaults = step_size:1000,...
|
||||||
|
|
||||||
[sriov_driver]
|
|
||||||
#vnic_type_blacklist = direct
|
|
||||||
|
|
||||||
Propagation of resource information
|
Propagation of resource information
|
||||||
-----------------------------------
|
-----------------------------------
|
||||||
|
|
||||||
@ -237,7 +242,7 @@ The flow of information is different for available and used resources.
|
|||||||
The authentic source of available resources is neutron agent configuration -
|
The authentic source of available resources is neutron agent configuration -
|
||||||
where the resources actually exist, as described in the agent configuration
|
where the resources actually exist, as described in the agent configuration
|
||||||
sections above. This information is propagated in the following chain:
|
sections above. This information is propagated in the following chain:
|
||||||
``neutron-x-agent -> neutron-server -> Placement``.
|
``neutron-l2-agent -> neutron-server -> Placement``.
|
||||||
|
|
||||||
From neutron agent to server the information is included in the
|
From neutron agent to server the information is included in the
|
||||||
``configurations`` field of the agent heartbeat message sent on the message
|
``configurations`` field of the agent heartbeat message sent on the message
|
||||||
@ -264,6 +269,7 @@ queue periodically.
|
|||||||
'reserved': 0,
|
'reserved': 0,
|
||||||
'step_size': 1},
|
'step_size': 1},
|
||||||
...
|
...
|
||||||
|
}
|
||||||
|
|
||||||
Re-reading the resource related subset of configuration on ``SIGHUP`` is not
|
Re-reading the resource related subset of configuration on ``SIGHUP`` is not
|
||||||
implemented. The agent must be restarted to pick up and send changed
|
implemented. The agent must be restarted to pick up and send changed
|
||||||
@ -310,7 +316,7 @@ than Neutron).
|
|||||||
It is important to note that the restart of neutron-server does not trigger
|
It is important to note that the restart of neutron-server does not trigger
|
||||||
any kind of re-sync to Placement (to avoid an update storm).
|
any kind of re-sync to Placement (to avoid an update storm).
|
||||||
|
|
||||||
As mentioned before the information flow for resources requested and
|
As mentioned before, the information flow for resources requested and
|
||||||
(if proper) allocated is different. It involves a conversation between Nova,
|
(if proper) allocated is different. It involves a conversation between Nova,
|
||||||
Neutron and Placement.
|
Neutron and Placement.
|
||||||
|
|
||||||
@ -375,7 +381,7 @@ servers with those ports:
|
|||||||
--network net0 \
|
--network net0 \
|
||||||
--qos-policy policy0
|
--qos-policy policy0
|
||||||
|
|
||||||
# alternatively an sr-iov port, unused in this example
|
# alternatively an SR-IOV port, unused in this example
|
||||||
$ openstack port create port-direct-qos \
|
$ openstack port create port-direct-qos \
|
||||||
--network net0 \
|
--network net0 \
|
||||||
--vnic-type direct \
|
--vnic-type direct \
|
||||||
@ -393,37 +399,38 @@ Since Placement carries a global view of a cloud deployment's resources
|
|||||||
(what is available, what is used) it may in some conditions get out of sync
|
(what is available, what is used) it may in some conditions get out of sync
|
||||||
with reality.
|
with reality.
|
||||||
|
|
||||||
One important case is when the Neutron-only Minimum Guaranteed Bandwidth
|
One important case is when the data-plane-only Minimum Guaranteed Bandwidth
|
||||||
feature was used before Stein (first released in Newton). Since before Stein
|
feature was used before Stein (first released in Newton). Since before Stein
|
||||||
guarantees were not enforced during server placement the available resources
|
guarantees were not enforced during server placement the available resources
|
||||||
may have become overallocated without notice. In this case Placement's view
|
may have become overallocated without notice. In this case Placement's view
|
||||||
and the reality of resource use should be made consistent during/after an
|
and the reality of resource usage should be made consistent during/after an
|
||||||
upgrade to Stein.
|
upgrade to Stein.
|
||||||
|
|
||||||
Another case stems from OpenStack not having distributed transactions to
|
Another case stems from OpenStack not having distributed transactions to
|
||||||
allocate resources provided by multiple OpenStack components (here Nova and
|
allocate resources provided by multiple OpenStack components (here Nova and
|
||||||
Neutron). There are known race conditions in which Placement's view may get
|
Neutron). There are known race conditions in which Placement's view may get
|
||||||
out of touch with reality. The design knowingly minimizes the race condition
|
out of sync with reality. The design knowingly minimizes the race condition
|
||||||
windows though. Known problems:
|
windows, but there are known problems:
|
||||||
|
|
||||||
* A QoS policy is modified after Nova read a port's ``resource_request``
|
* If a QoS policy is modified after Nova read a port's ``resource_request``
|
||||||
but before the port is bound.
|
but before the port is bound its state before the modification will be
|
||||||
* `A bound port with a resource allocation is deleted. The port's allocation
|
applied.
|
||||||
is leaked. <https://bugs.launchpad.net/nova/+bug/1820588>`_
|
* If a bound port with a resource allocation is deleted. The port's allocation
|
||||||
|
is leaked. `<https://bugs.launchpad.net/nova/+bug/1820588>`_
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
|
||||||
Deleting a bound port has no known use case though. Please consider
|
Deleting a bound port has no known use case. Please consider detaching
|
||||||
detaching the interface first by ``openstack server remove port`` instead.
|
the interface first by ``openstack server remove port`` instead.
|
||||||
|
|
||||||
Incorrect allocations may be fixed by:
|
Incorrect allocations may be fixed by:
|
||||||
|
|
||||||
* Moving a server deletes the wrong and re-creates the proper allocation
|
* Moving the server, which will delete the wrong allocation and create the
|
||||||
as soon as move operations are implemented (not in Stein unfortunately).
|
correct allocation as soon as move operations are implemented (not in Stein
|
||||||
Moving servers fixes local overallocations.
|
unfortunately). Moving servers fixes local overallocations.
|
||||||
* The need for an upgrade-helper allocation healing tool is being tracked in
|
* The need for an upgrade-helper allocation healing tool is being tracked in
|
||||||
`bug/1819923 <https://bugs.launchpad.net/nova/+bug/1819923>`_.
|
`bug 1819923 <https://bugs.launchpad.net/nova/+bug/1819923>`_.
|
||||||
* Manually by `openstack resource provider allocation set
|
* Manually, by using `openstack resource provider allocation set
|
||||||
<https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-allocation-set>`_
|
<https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-allocation-set>`_
|
||||||
/`delete <https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-allocation-delete>`_.
|
/`delete <https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-allocation-delete>`_.
|
||||||
|
|
||||||
@ -449,6 +456,8 @@ Debugging
|
|||||||
# as admin
|
# as admin
|
||||||
$ openstack network agent show ... | grep configurations
|
$ openstack network agent show ... | grep configurations
|
||||||
|
|
||||||
|
Please find an example in section `Propagation of resource information`_.
|
||||||
|
|
||||||
* Did neutron-server successfully sync to Placement?
|
* Did neutron-server successfully sync to Placement?
|
||||||
|
|
||||||
.. code-block:: console
|
.. code-block:: console
|
||||||
@ -456,6 +465,8 @@ Debugging
|
|||||||
# as admin
|
# as admin
|
||||||
$ openstack network agent show ... | grep resources_synced
|
$ openstack network agent show ... | grep resources_synced
|
||||||
|
|
||||||
|
Please find an example in section `Propagation of resource information`_.
|
||||||
|
|
||||||
* Is the resource provider tree correct? Is the root a compute host? One level
|
* Is the resource provider tree correct? Is the root a compute host? One level
|
||||||
below the agents? Two levels below the physical network interfaces?
|
below the agents? Two levels below the physical network interfaces?
|
||||||
|
|
||||||
|
Loading…
x
Reference in New Issue
Block a user