After all of the discussions we had on
"https://review.opendev.org/#/c/670626/2", I studied all projects that
have an "oslo_messaging" section. Afterwards, I applied the same method
that is already used in "oslo_messaging" section in Nova, Cinder, and
others. This guarantees that we have a consistent method to
enable/disable notifications across projects based on components (e.g.
Ceilometer) being enabled or disabled. Here follows the list of
components, and the respective changes I did.
* Aodh:
The section is declared, but it is not used. Therefore, it will
be removed in an upcomming PR.
* Congress:
The section is declared, but it is not used. Therefore, it will
be removed in an upcomming PR.
* Cinder:
It was already properly configured.
* Octavia:
The section is declared, but it is not used. Therefore, it will
be removed in an upcomming PR.
* Heat:
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Ceilometer:
Ceilometer publishes some messages in the rabbitMQ. However, the
default driver is "messagingv2", and not ''(empty) as defined in Oslo;
these configurations are defined in ceilometer/publisher/messaging.py.
Therefore, we do not need to do anything for the
"oslo_messaging_notifications" section in Ceilometer
* Tacker:
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Neutron:
It was already properly configured.
* Nova
It was already properly configured. However, we found another issue
with its configuration. Kolla-ansible does not configure nova
notifications as it should. If 'searchlight' is not installed (enabled)
the 'notification_format' should be 'unversioned'. The default is
'both'; so nova will send a notification to the queue
versioned_notifications; but that queue has no consumer when
'searchlight' is disabled. In our case, the queue got 511k messages.
The huge amount of "stuck" messages made the Rabbitmq cluster
unstable.
https://bugzilla.redhat.com/show_bug.cgi?id=1478274https://bugs.launchpad.net/ceilometer/+bug/1665449
* Nova_hyperv:
I added the same configurations as in Nova project.
* Vitrage
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Searchlight
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.
* Ironic
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.
* Glance
It was already properly configured.
* Trove
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Blazar
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Sahara
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Watcher
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.
* Barbican
I created a mechanism similar to what we have in Cinder, Nova,
and others. I also added a configuration to 'keystone_notifications'
section. Barbican needs its own queue to capture events from Keystone.
Otherwise, it has an impact on Ceilometer and other systems that are
connected to the "notifications" default queue.
* Keystone
Keystone is the system that triggered this work with the discussions
that followed on https://review.opendev.org/#/c/670626/2. After a long
discussion, we agreed to apply the same approach that we have in Nova,
Cinder and other systems in Keystone. That is what we did. Moreover, we
introduce a new topic "barbican_notifications" when barbican is
enabled. We also removed the "variable" enable_cadf_notifications, as
it is obsolete, and the default in Keystone is CADF.
* Mistral:
It was hardcoded "noop" as the driver. However, that does not seem a
good practice. Instead, I applied the same standard of using the driver
and pushing to "notifications" queue if Ceilometer is enabled.
* Cyborg:
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.
* Murano
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Senlin
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Manila
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Zun
The section is declared, but it is not used. Therefore, it will
be removed in an upcomming PR.
* Designate
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
* Magnum
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
Closes-Bug: #1838985
Change-Id: I88bdb004814f37c81c9a9c4e5e491fac69f6f202
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.
This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.
I added some FIXMEs around which are relevant to kolla-ansible docker
integration. They are not fixed in here to not alter behavior.
[1] https://review.opendev.org/667363
Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
The ironic inspector iPXE configuration includes the following kernel
argument:
initrd=agent.ramdisk
However, the ramdisk is actually called ironic-agent.initramfs, so the
argument should be:
initrd=ironic-agent.initramfs
In BIOS boot mode this does not cause a problem, but for compute nodes
with UEFI enabled, it seems to be more strict about this, and fails to
boot.
Change-Id: Ic84f3b79fdd3cd1730ca2fb79c11c7a4e4d824de
Closes-Bug: #1836375
A common class of problems goes like this:
* kolla-ansible deploy
* Hit a problem, often in ansible/roles/*/tasks/bootstrap.yml
* Re-run kolla-ansible deploy
* Service fails to start
This happens because the DB is created during the first run, but for some
reason we fail before performing the DB sync. This means that on the second run
we don't include ansible/roles/*/tasks/bootstrap_service.yml because the DB
already exists, and therefore still don't perform the DB sync. However this
time, the command may complete without apparent error.
We should be less careful about when we perform the DB sync, and do it whenever
it is necessary. There is an argument for not doing the sync during a
'reconfigure' command, although we will not change that here.
This change only always performs the DB sync during 'deploy' and
'reconfigure' commands.
Change-Id: I82d30f3fcf325a3fdff3c59f19a1f88055b566cc
Closes-Bug: #1823766
Closes-Bug: #1797814
Currently, we have a lot of logic for checking if a handler should run,
depending on whether config files have changed and whether the
container configuration has changed. As rm_work pointed out during
the recent haproxy refactor, these conditionals are typically
unnecessary - we can rely on Ansible's handler notification system
to only trigger handlers when they need to run. This removes a lot
of error prone code.
This patch removes conditional handler logic for all services. It is
important to ensure that we no longer trigger handlers when unnecessary,
because without these checks in place it will trigger a restart of the
containers.
Implements: blueprint simplify-handlers
Change-Id: I4f1aa03e9a9faaf8aecd556dfeafdb834042e4cd
Many tasks that use Docker have become specified already, but
not all. This change ensures all tasks that use the following
modules have become:
* kolla_docker
* kolla_ceph_keyring
* kolla_toolbox
* kolla_container_facts
It also adds become for 'command' tasks that use docker CLI.
Change-Id: I4a5ebcedaccb9261dbc958ec67e8077d7980e496
When ansible goes in to a loop, by default it prints all the keys for
the item it is looping over. Some roles, when setting up the databases,
iterate over an object that includes the database password.
Override the loop label to hide everything but the database name.
Change-Id: I336a81a5ecd824ace7d40e9a35942a1c853554cd
When integrating 3rd party component into openstack with kolla-ansible,
maybe have to mount some extra volumes to container.
Change-Id: I69108209320edad4c4ffa37dabadff62d7340939
Implements: blueprint support-extra-volumes
With Docker CE, the daemon sets the default policy of the iptables
FORWARD chain to DROP. This causes problems for provisioning bare metal
servers when ironic inspector is used with the 'iptables' PXE filter.
It's not entirely clear why these two things interact in this way,
but switching to the 'dnsmasq' filter works around the issue, and is
probably a good move anyway because it is more efficient.
We have added a migration task here to flush and remove the ironic-inspector
iptables chain since inspector does not do this itself currently.
Change-Id: Iceed5a096819203eb2b92466d39575d3adf8e218
Closes-Bug: #1823044
Several config file permissions are incorrect on the host. In general,
files should be 0660, and directories and executables 0770.
Change-Id: Id276ac1864f280554e98b937f2845bb424d521de
Closes-Bug: #1821579
When adding the rolling upgrade support, some upgrade procedures were
modified to pull images explicitly. This is done inconsistently between
services, and is a change in behaviour from Rocky and earlier releases.
This change removes all image pulling from upgrade tasks.
Change-Id: Id0fed17714235e1daed60b83b1f30620f097eb97
This allows ironic service endpoints to use custom hostnames, and adds the
following variables:
* ironic_internal_fqdn
* ironic_external_fqdn
* ironic_inspector_internal_fqdn
* ironic_inspector_external_fqdn
These default to the old values of kolla_internal_fqdn or
kolla_external_fqdn.
This also adds ironic_api_listen_port and ironic_inspector_listen_port
options, which default to ironic_api_port and ironic_inspector_port for
backward compatibility.
These options allow the user to differentiate between the port the
service listens on, and the port the service is reachable on. This is
useful for external load balancers which live on the same host as the
service itself.
Change-Id: I45b175e85866b4cfecad8451b202a5a27f888a84
Implements: blueprint service-hostnames
We're duplicating code to build the keystone URLs in nearly every
config, where we've already done it in group_vars. Replace the
redundancy with a variable that does the same thing.
Change-Id: I207d77870e2535c1cdcbc5eaf704f0448ac85a7a
Adds a new flag, 'enable_openstack_core', which defaults to 'yes'.
Setting this flag to 'no' will disable the core OpenStack services,
including Glance, Heat, Horizon, Keystone, Neutron, and Nova.
Improves the default configuration of OpenStack Ironic when used in
standalone mode. In particular, configures a noauth mode when Keystone
is disabled, and allows the iPXE server to be used for provisioning as
well as inspection if Neutron is disabled.
Documentation for standalone ironic will be updated separately.
This patch was developed and tested using Bikolla [1].
[1] https://github.com/markgoddard/bikolla
Change-Id: Ic47f5ad81b8126a51e52a445097f7950dba233cd
Implements: blueprint standalone-ironic
This allows neutron service endpoints to use custom hostnames, and adds the
following variables:
* neutron_internal_fqdn
* neutron_external_fqdn
These default to the old values of kolla_internal_fqdn or
kolla_external_fqdn.
This also adds a neutron_server_listen_port option, which defaults to
neutron_server_port for backward compatibility.
This option allow the user to differentiate between the port the
service listens on, and the port the service is reachable on. This is
useful for external load balancers which live on the same host as the
service itself.
Change-Id: I87d7387326b6eaa6adae1600b48d480319d10676
Implements: blueprint service-hostnames
This allows glance service endpoints to use custom hostnames, and adds the
following variables:
* glance_internal_fqdn
* glance_external_fqdn
These default to the old values of kolla_internal_fqdn or
kolla_external_fqdn.
This also adds a glance_api_listen_port option, which defaults to
glance_api_port for backward compatibility.
This option allow the user to differentiate between the port the
service listens on, and the port the service is reachable on. This is
useful for external load balancers which live on the same host as the
service itself.
Change-Id: Icb91f728533e2db1908b23dabb0501cf9f8a2b75
Implements: blueprint service-hostnames
The ironic TFTP server should be accessed via the internal API network.
For ironic inspector, dnsmasq.conf advertises this correctly:
dhcp-option=option:tftp-server,'api_interface_address'
dhcp-option=option:server-ip-address,'api_interface_address'
However, ironic conductor does not set the [pxe] tftp_server variable.
This means the TFTP server advertised gets the default value of $my_ip,
which is set by
https://docs.openstack.org/oslo.utils/latest/reference/netutils.html#oslo_utils.netutils.get_my_ipv4,
typically the source IP for the default route.
This change sets [pxe] tftp_server to 'api_interface_address'.
Change-Id: Ic3e688b3f2b92ad9515322f49cd5f4f29d763e49
Closes-Bug: #1808347
With this change, an operator may be able to stop a
service container without stopping all services in a host.
This change is the starting point to start
fast-forward upgrades support.
In next changes new flags will be introducced to disable
stop dataplane services during upgrades.
Change-Id: Ifde7a39d7d8596ef0d7405ecf1ac1d49a459d9ef
Implements: blueprint support-stop-containers
The dnsmasq PXE filter [1] provides far better scalability than the
iptables filter typically used. Inspector manages files in a dhcp-hostsdir
directory that is watched by dnsmasq via inotify. Dnsmasq then either
whitelists or blacklists MAC addresses based on the contents of these
files.
This change adds a new variable, ironic_inspector_pxe_filter, that can
be used to configure the PXE filter for ironic inspector. Currently
supported values are 'iptables' and 'dnsmasq', with 'iptables' being the
default for backwards compatibility.
[1]
https://docs.openstack.org/ironic-inspector/latest/admin/dnsmasq-pxe-filter.html
Implements: blueprint ironic-inspector-dnsmasq-pxe-filter
Change-Id: I73cae9c33b49972342cf1984372a5c784df5cbc2
If the [processing] ramdisk_logs_dir option is set, logs returned by the
ironic inspection ramdisk following hardware inspection will be stored
at that location. This enables easier debugging if inspection fails.
Change-Id: I36bdf75c04b088b67b5f54fdf20251c10bdddb63
The firewall section has been renamed in upstream ironic inspector:
7b27585463
Consequently the iptables pxe filter does not work if the actual
dnsmasq interface name differs from the default (br-ctlplane), as can
be seen from this snippet of iptables-save output:
-A INPUT -i br-ctlplane -p udp -m udp --dport 67 -j ironic-inspector
Change-Id: Ic1d08b85e0b5992fbee489f2f9fd174982b5d493
Having all services in one giant haproxy file makes altering
configuration for a service both painful and dangerous. Each service
should be configured with a simple set of variables and rendered with a
single unified template.
Available are two new templates:
* haproxy_single_service_listen.cfg.j2: close to the original style, but
only one service per file
* haproxy_single_service_split.cfg.j2: using the newer haproxy syntax
for separated frontend and backend
For now the default will be the single listen block, for ease of
transition.
Change-Id: I6e237438fbc0aa3c89a3c8bd706a53b74e71904b
Now kolla dev mode only support clone master branch from git,
add version tag to support clone dedicated branch.
Change-Id: I88de238e5dc7461ba0662a3ecea9a2d80fd0db60
Option auth_uri from group keystone_authtoken is deprecated[1].
Use option www_authenticate_uri from group keystone_authtoken.
[1]https://review.openstack.org/#/c/508522/
Co-Authored-By: confi-surya <singh.surya64mnnit@gmail.com>
Change-Id: Ifd8527d404f1df807ae8196eac2b3849911ddc26
Closes-Bug: #1761907
The variable 'ironic_dnsmasq_interface' is used to configure the interface
used by the ironic inspector dnsmasq service for DHCP on the inspection
network. It is being used correctly in inspector.conf, but not in the
dnsmasq configuration file, which uses api_interface. This change modifies
the dnsmasq configuration file to also use ironic_dnsmasq_interface.
Change-Id: I7670544f4bc41c93ac1d081486502f9ffb8f2286
Closes-Bug: #1785574
Ironic requires the Keystone credentials to communicate with Cinder if
booting from volume.
Change-Id: Id9a90d986e391e84c8ad918af371a5aef33a3524
Closes-Bug: #1785201
This commit is to apply resource-constraints to a few more OpenStack services.
Commit to apply constraints to the last set of services will be made in
the upcoming commit.
Depends-on: Icafa54baca24d2de64238222a5677b9d8b90e2aa
Change-Id: I39004f54281f97d53dfa4b1dbcf248650ad6f186
Both the driver and the enabled_drivers options are being removed
this week. Stop setting them to avoid breakages.
Change-Id: I0e0bf851424b8f5839b159ef83f1cc65c30e2fb3
Add become to all tasks that use the module "kolla_docker"
Change-Id: I4309c4011687b88ec31d739fd8f834fe2326ff10
Partial-Implements: blueprint ansible-specific-task-become
When enable_ironic_ipxe is set in /etc/kolla/globals.yml,
the following happens:
- a new docker container, ironic_ipxe, is created. This contains
an apache webserver used to serve up the boot images
- ironic is configured to use ipxe
Change-Id: I08fca1864a00afb768494406c49e968920c83ae7
Implements: blueprint ironic-ipxe
By now, ironic-dnsmasq has default bootfile pxelinux.0,
which is correct only for x86.
Adding ironic_dnsmasq_boot_file parameter to globals.yml
to make it configuable.
For example: /etc/kolla/globals.yml
ironic_dnsmasq_boot_file: "debian-installer/arm64/bootnetaa64.efi"
Change-Id: I6eb57702d4dad549ef8c999c1c82e577f316d8d6
In change I78cb60168aaa40bb6439198283546b7faf33917c, action was changed
to kolla_action, and serial to kolla_serial, to avoid Ansible warnings
due to use of reserved keywords. In that change, some keywords were
missed, and some changes that were merged since then have not switched
to the new variables. This change fixes all current instances of those
issues.
Change-Id: I357dffdfcb2b405e280a962d366ee65eebf0a8d1
Implements: blueprint migrate-to-ansible-2-2-0