The PXE components are now handled like other containers. This ensures
that an upgrade also updates the PXE components and that no additional
deploy/reconfigure step is needed.
Closes-Bug: #1963752
Change-Id: I368780143086bc5baab1556a5ec75c19950d5e3c
This commit adds support for pushing Ceilometer metrics
to Prometheus, either instead of Gnocchi or alongside it.
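A minimal sketch of how this might be enabled in globals.yml; the exact
name of the new toggle is an assumption based on this change, so check
the release note or role defaults:

    # globals.yml (illustrative only)
    enable_ceilometer: "yes"
    enable_prometheus: "yes"
    # assumed name of the flag introduced by this change:
    enable_ceilometer_prometheus_pushgateway: "yes"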
Closes-Bug: #1964135
Signed-off-by: Juan Pablo Suazo <jsuazo@whitestack.com>
Change-Id: I9fd32f63913a534c59e2d17703702074eea5dd76
Change Ia1239069ccee39416b20959cbabad962c56693cf added support for
running a libvirt daemon on the host, rather than using the nova_libvirt
container. It did not cover migration of existing hosts from using a
container to using a host daemon.
This change adds a kolla-ansible nova-libvirt-cleanup command which may
be used to clean up the nova_libvirt container, volumes and related
items on hosts, once the container has been disabled.
The playbook assumes that compute hosts have been emptied of VMs before
it runs. A future extension could support migration of existing VMs, but
this is currently out of scope.
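A rough outline of the expected workflow on an emptied compute host,
assuming the standard kolla-ansible CLI; the inventory path and tags
here are illustrative only:

    # globals.yml: stop managing libvirt in a container
    enable_nova_libvirt_container: "no"

    # reconfigure compute hosts, then remove the old container,
    # volumes and related items:
    #   kolla-ansible -i ./inventory reconfigure --tags nova-cell
    #   kolla-ansible -i ./inventory nova-libvirt-cleanup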
Change-Id: I46854ed7eaf1d5b5e3ccd8531c963427848bdc99
In some cases it may be desirable to run the libvirt daemon on the
host, for example when mixing host and container OS distributions or
versions.
This change makes it possible to disable the nova_libvirt container, by
setting enable_nova_libvirt_container to false. The default values of
some Docker mounts and other paths have been updated to point to default
host directories rather than Docker volumes when using a host libvirt
daemon.
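As a sketch, a deployment using a host libvirt daemon would set the
following in globals.yml, with libvirt itself installed and managed on
the host (for example via the ansible-collection-kolla change in the
Depends-On below):

    # globals.yml
    enable_nova_libvirt_container: "no"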
This change does not handle migration of existing systems from using
a nova_libvirt container to libvirt on the host.
Depends-On: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/830504
Change-Id: Ia1239069ccee39416b20959cbabad962c56693cf
Consistently use template instead of copy. This has the added
advantage of allowing variables inside ceph conf files and keyrings.
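For example, a config overlay can now contain Jinja2 expressions which
are rendered at deploy time; the variable used here is purely
hypothetical:

    # /etc/kolla/config/cinder/ceph.conf (illustrative)
    [global]
    mon host = {{ my_ceph_mon_hosts | join(',') }}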
Closes-Bug: #1959565
Signed-off-by: Imran Hussain <ih@imranh.co.uk>
Change-Id: Ibd0ff2641a54267ff06d3c89a26915a455dff1c1
This project [1] can provide a one-stop solution for log collection,
cleaning, indexing, analysis, alerting, visualization and report
generation, helping operators and maintainers to quickly retrieve and
diagnose problems, grasp the operational health of the platform, and
improve the level of platform management.
[1] https://wiki.openstack.org/wiki/Venus
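A sketch of enabling the service in globals.yml; the flag name is an
assumption and should be checked against the role defaults:

    # globals.yml (illustrative)
    enable_venus: "yes"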
Change-Id: If3562bbed6181002b76831bab54f863041c5a885
In Kolla Ansible OpenStack deployments, by default, libvirt is
configured to allow read-write access via an unauthenticated,
unencrypted TCP connection, using the internal API network. This is to
facilitate migration between hosts.
By default, Kolla Ansible does not use encryption for services on the
internal network (and did not support it until Ussuri). However, most
other services on the internal network are at least authenticated
(usually via passwords), ensuring that they cannot be used by anyone
with access to the network, unless they have credentials.
The main issue here is the lack of authentication. Any client with
access to the internal network is able to connect to the libvirt TCP
port and make arbitrary changes to the hypervisor. This could include
starting a VM, modifying an existing VM, etc. Given the flexibility of
the domain options, it could be seen as equivalent to having root access
to the hypervisor.
Kolla Ansible has supported libvirt TLS [1] since the Train release, using
client and server certificates for mutual authentication and encryption.
However, this feature is not enabled by default, and requires
certificates to be generated for each compute host.
This change adds support for libvirt SASL authentication and enables it
by default. This provides a base level of security. Deployments
requiring further security should use libvirt TLS.
[1] https://docs.openstack.org/kolla-ansible/latest/reference/compute/libvirt-guide.html#libvirt-tls
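As a sketch, assuming the toggles used by this change are named as
below (verify against the role defaults), a deployment could control
SASL and combine it with TLS:

    # globals.yml (illustrative; flag names are assumptions)
    # SASL authentication is enabled by default after this change:
    libvirt_enable_sasl: "yes"
    # for encryption in addition to authentication, enable libvirt TLS:
    libvirt_tls: "yes"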
Depends-On: https://review.opendev.org/c/openstack/kolla/+/833021
Closes-Bug: #1964013
Change-Id: Ia91ceeb609e4cdb144433122b443028c0278b71e
Add "enable_prometheus_etcd_integration" configuration parameter which
can be used to configure Prometheus to scrape etcd metrics endpoints.
The default value of "enable_prometheus_etcd_integration" is true only
when both "enable_prometheus" and "enable_etcd" are enabled.
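A minimal example of controlling this explicitly in globals.yml:

    # globals.yml
    enable_prometheus: "yes"
    enable_etcd: "yes"
    # defaults to the two flags above combined; override if needed:
    enable_prometheus_etcd_integration: "yes"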
Change-Id: I7a0b802c5687e2d508e06baf55e355d9761e806f
While I8bb398e299aa68147004723a18d3a1ec459011e5 stopped setting
the net.ipv4.ip_forward sysctl, this change explicitly removes the
option from the Kolla sysctl config file. In the absence of another
source for this sysctl, it should revert to the default of 0 after the
next reboot.
A deployer looking to more aggressively change the value may set
neutron_l3_agent_host_ipv4_ip_forward to 0. Any deployments still
relying on the previous value may set
neutron_l3_agent_host_ipv4_ip_forward to 1.
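For example, in globals.yml:

    # globals.yml
    # keep the previous behaviour (forwarding enabled on the host):
    neutron_l3_agent_host_ipv4_ip_forward: 1
    # or explicitly disable forwarding:
    # neutron_l3_agent_host_ipv4_ip_forward: 0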
Related-Bug: #1945453
Change-Id: I9b39307ad8d6c51e215fe3d3bc56aab998d218ec
Ironic has changed the default PXE flavour from plain PXE to iPXE in
Yoga. Kolla Ansible supports one or the other, and we tend to stick to
upstream defaults, so this change enables iPXE instead of plain PXE by
default. Users may change back to plain PXE, but doing so requires one
additional action, so it is worth reminding them via an upgrade note
either way.
Change-Id: If14ec83670d2212906c6e22c7013c475f3c4748a
When OpenStack is deployed with Kolla-Ansible, by default there
are no durable queues or exchanges created by the OpenStack
services in RabbitMQ. In Rabbit terminology, not being durable
is referred to as `transient`, and this means that the queue
is generally held in memory.
Whether OpenStack services create durable or transient queues is
traditionally controlled by the oslo.messaging config option
`amqp_durable_queues`. In Kolla-Ansible, this remains set to
the default of `False` in all services. The only `durable`
objects are the `amq*` exchanges, which are internal to RabbitMQ.
More recently, oslo.messaging has introduced support for
Quorum queues [7]. These are a successor to durable classic
queues; however, it isn't yet clear whether they are a good fit for
OpenStack in general [8].
For clustered RabbitMQ deployments, Kolla-Ansible configures all
queues as `replicated` [1]. Replication occurs over all nodes
in the cluster. RabbitMQ refers to this as 'mirroring of classic
queues'.
In summary, this means that a multi-node Kolla-Ansible deployment
will end up with a large number of transient, mirrored queues
and exchanges. However, the RabbitMQ documentation warns against
this, stating that 'For replicated queues, the only reasonable
option is to use durable queues' [2]. This is discussed
further in the following bug report: [3].
Whilst we could try enabling the `amqp_durable_queues` option
for each service (this is suggested in [4]), there are
a number of complexities with this approach, not limited to:
1) RabbitMQ is planning to remove classic queue mirroring in
favor of 'Quorum queues' in a forthcoming release [5].
2) Durable queues will be written to disk, which may cause
performance problems at scale. Note that this includes
Quorum queues which are always durable.
3) Potential for race conditions and other complexity
discussed recently on the mailing list under:
`[ops] [kolla] RabbitMQ High Availability`
The remaining option, proposed here, is to use classic
non-mirrored queues everywhere, and rely on services to recover
if the node hosting a queue or exchange they are using fails.
There is some discussion of this approach in [6]. The downside
of potential message loss needs to be weighed against the real
upsides of increasing the performance of RabbitMQ, and moving
to a configuration which is officially supported and hopefully
more stable. In the future, we can then consider promoting
specific queues to quorum queues, in cases where message loss
can result in failure states which are hard to recover from.
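Deployers who still want durable queues for particular services can
opt in via oslo.messaging configuration, for example through
kolla-ansible's config override mechanism; a sketch (the path and the
per-service effect should be validated before use):

    # /etc/kolla/config/global.conf (illustrative)
    [oslo_messaging_rabbit]
    amqp_durable_queues = true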
[1] https://www.rabbitmq.com/ha.html
[2] https://www.rabbitmq.com/queues.html
[3] https://github.com/rabbitmq/rabbitmq-server/issues/2045
[4] https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
[5] https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
[6] https://fuel-ccp.readthedocs.io/en/latest/design/ref_arch_1000_nodes.html#replication
[7] https://bugs.launchpad.net/oslo.messaging/+bug/1942933
[8] https://www.rabbitmq.com/quorum-queues.html#use-cases
Partial-Bug: #1954925
Change-Id: I91d0e23b22319cf3fdb7603f5401d24e3b76a56e
The Prometheus HTTP API is reachable under /api/v1. Without this fix,
CloudKitty receives 404 errors from Prometheus.
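A sketch of the resulting CloudKitty configuration, assuming
CloudKitty's Prometheus collector/fetcher options and Kolla Ansible's
default Prometheus port; the host address and values are illustrative:

    # cloudkitty.conf (illustrative)
    [collector_prometheus]
    prometheus_url = http://192.0.2.10:9091/api/v1

    [fetcher_prometheus]
    prometheus_url = http://192.0.2.10:9091/api/v1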
Change-Id: Ie872da5ccddbcb8028b8b57022e2427372ed474e
Without this configuration, all mount points are reporting the same
utilisation metrics [1]. With the rslave option, all root mounts from
the host are visible in the container, so we can remove the bind mounts
for /proc and /sys.
[1] https://github.com/prometheus/node_exporter#docker
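A sketch of the container options this implies, following the upstream
example in [1]; the exact Kolla service definition may differ:

    # node_exporter container (illustrative)
    volumes:
      - "/:/host:ro,rslave"
    command:
      - "--path.rootfs=/host"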
Change-Id: I4087dc81f9d1fa5daa24b9df6daf1f9e1ccd702f
Closes-Bug: #1961438
An FCD, also known as an Improved Virtual Disk (IVD) or
Managed Virtual Disk, is a named virtual disk independent of
a virtual machine. Using FCDs for Cinder volumes eliminates
the need for shadow virtual machines.
This patch adds Kolla support.
Change-Id: Ic0b66269e6d32762e786c95cf6da78cb201d2765
NSXP is the OpenStack support for the NSX Policy platform. It has been
supported by Neutron since the Stein release. This patch adds Kolla
support.
This adds a new neutron_plugin_agent type 'vmware_nsxp'. The plugin
does not run any neutron agents.
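For example, in globals.yml:

    # globals.yml
    neutron_plugin_agent: "vmware_nsxp"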
Change-Id: I9e9d8f07e586bdc143d293e572031368af7f3fca
The bootloader used to boot Ironic nodes in UEFI boot mode during
inspection when iPXE is enabled has been changed from ipxe.efi to
snponly.efi. This is in line with the default UEFI iPXE bootloader used
in Ironic since the Xena release. The bootloader may be changed via
ironic_dnsmasq_uefi_ipxe_boot_file.
Note that snponly.efi was not available in the ironic-pxe image
prior to I79e78dca550262fc86b092a036f9ea96b214ab48.
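For example, to revert to the previous bootloader:

    # globals.yml
    ironic_dnsmasq_uefi_ipxe_boot_file: "ipxe.efi"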
Related-Bug: #1959203
Change-Id: I879db340769cc1b076e77313dff15876e27fcac4
Allow operators to set the HAProxy socket to admin level.
This is done via the flag haproxy_socket_level_admin, which
is set to "no" by default.
Closes-Bug: #1960215
Signed-off-by: Imran Hussain <ih@imranh.co.uk>
Change-Id: Ia0da89288d68f5803ace1934c013053f12343195