Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. In the case of the check-containers.yml include, the
included file only has a single task, so the overhead of skipping this
task will not be greater than the overhead of the task import. It
therefore makes sense to switch to use import_tasks there.
Partially-Implements: blueprint performance-improvements
Change-Id: I65d911670649960708b9f6a4c110d1a7df1ad8f7
This makes use of udev rules to make it smarter and override
host-level packages settings.
Additionally, this masks Ubuntu-only service that is another
pain point in terms of /dev/kvm permissions.
Fingers crossed for no further surprises.
Change-Id: I61235b51e2e1325b8a9b4f85bf634f663c7ec3cc
Closes-bug: #1681461
The nova-cell role sets the following sysctls on compute hosts, which
require the br_netfilter kernel module to be loaded:
net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-ip6tables
If it is not loaded, then we see the following errors:
Failed to reload sysctl:
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
Loading the br_netfilter module resolves this issue.
Typically we do not see this since installing Docker and configuring it
to manage iptables rules causes the br_netfilter module to be loaded.
There are good reasons [1] to disable Docker's iptables management
however, in which case we are likely to hit this issue.
This change loads the br_netfilter module in the nova-cell role for
compute hosts.
[1] https://bugs.launchpad.net/kolla-ansible/+bug/1849275
Co-Authored-By: Dincer Celik <hello@dincercelik.com>
Change-Id: Id52668ba8dab460ad4c33fad430fc8611e70825e
The common role was previously added as a dependency to all other roles.
It would set a fact after running on a host to avoid running twice. This
had the nice effect that deploying any service would automatically pull
in the common services for that host. When using tags, any services with
matching tags would also run the common role. This could be both
surprising and sometimes useful.
When using Ansible at large scale, there is a penalty associated with
executing a task against a large number of hosts, even if it is skipped.
The common role introduces some overhead, just in determining that it
has already run.
This change extracts the common role into a separate play, and removes
the dependency on it from all other roles. New groups have been added
for cron, fluentd, and kolla-toolbox, similar to other services. This
changes the behaviour in the following ways:
* The common role is now run for all hosts at the beginning, rather than
prior to their first enabled service
* Hosts must be in the necessary group for each of the common services
in order to have that service deployed. This is mostly to avoid
deploying on localhost or the deployment host
* If tags are specified for another service e.g. nova, the common role
will *not* automatically run for matching hosts. The common tag must
be specified explicitly
The last of these is probably the largest behaviour change. While it
would be possible to determine which hosts should automatically run the
common role, it would be quite complex, and would introduce some
overhead that would probably negate the benefit of splitting out the
common role.
Partially-Implements: blueprint performance-improvements
Change-Id: I6a4676bf6efeebc61383ec7a406db07c7a868b2a
Change I810aad7d49db3f5a7fd9a2f0f746fd912fe03917 for supporting multiple
Nova cells updated the list of containers that require a policy file to
only include nova-api, nova-compute, and nova-compute-ironic.
The nova-conductor config.json template was left unchanged and fails to
copy the nova policy file into its container. This can be seen on a
fresh deployment, but might be missed on an upgrade if an older policy
file is still available in /etc/kolla/nova-conductor.
This commit removes the nova_policy_file block from the nova-conductor
config.json template, as it shouldn't be required.
Backport: ussuri, train
Change-Id: I17256b182d207aeba3f92c65a6d7cf3611180558
Closes-Bug: #1886170
when enable kolla_dev_mod, nova-cell role clones code failed,
because we use nova-cell repository which is not exists.
in fact, nova-cell role should use nova repository too
Change-Id: I7fa62726d0d5b0aeb3bd5fa06dc0e59667f94fa0
non-root user has no permission to create directory under /opt
directory. use "become: true" to resolve it.
Change-Id: I155efc4b1e0691da0aaf6ef19ca709e9dc2d9168
The RabbitMQ 'openstack' user has the 'administrator' tag assigned via
the RabbitMQ definitions.json file.
Since the Train release, the nova-cell role also configures the RabbitMQ
user, but omits the tag. This causes the tag to be removed from the
user, which prevents it from accessing the management UI and API.
This change adds support for configuring user tags to the
service-rabbitmq role, and sets the administrator tag by default.
Change-Id: I7a5d6fe324dd133e0929804d431583e5b5c1853d
Closes-Bug: #1875786
Nova cells support introduced a slight regression that triggers
odd behaviour when we tried switching to Apache (httpd) [1].
Bootstrap no longer applied permissions recursively to all log
files, creating a discrepancy between normal and bootstrap runs
and also Nova and other services such as Cinder (regarding
bootstrap logging).
This patch fixes it.
Backport to Train.
Not creating reno nor a bug record because it does not affect
any current standard usage in any currently known way.
Note this only really hides (standardizes?) the global issue that
we don't control file permissions on newly created files too well.
[1] https://review.opendev.org/724793
Change-Id: I35e9924ccede5edd2e1307043379aba944725143
Needed-By: https://review.opendev.org/724793
If using a separate message queue for nova notifications, i.e.
nova_cell_notify_transport_url is different from
nova_cell_rpc_transport_url, then Kolla Ansible will unnecessarily
update the cell. This should not cause any issues since the URL is taken
from nova.conf.
This change fixes the comparison to use the correct URL.
Change-Id: I5f0e30957bfd70295f2c22c86349ebbb4c1fb155
Closes-Bug: #1873255
Deploy a small cloud. Add one host to the compute group in the
inventory, and scale out:
$ kolla-ansible deploy --limit <new compute host>
The command succeeds, but creating an instance fails with the following:
Host 'compute0' is not mapped to any cell
This happens because we only discover computes on the first host in the
cell's nova conductor group. If that host is not in the specified limit,
the discovery will not happen.
This change fixes the issue by running compute discovery when any ironic
or virtualised compute hosts are in the play batch, and delegating it to
a conductor.
Change-Id: Ie984806240d147add825ffa8446ae6ff55ca4814
Closes-Bug: #1869371
Refactor service configuration to use the copy certificates task. This
reduces code duplication and simplifies implementing encrypting backend
HAProxy traffic for individual services.
Change-Id: I0474324b60a5f792ef5210ab336639edf7a8cd9e
Some services look for /etc/timezone on Debian/Ubuntu, so we should
introduce it to the containers.
In addition, added prechecks for /etc/localtime and /etc/timezone.
Closes-Bug: #1821592
Change-Id: I9fef14643d1bcc7eee9547eb87fa1fb436d8a6b3
In kolla ansible we typically configure services to communicate via IP
addresses rather than hostnames. One accidental exception to this was
live migration, which used the hostname of the destination even when
not required (i.e. TLS not being used for libvirt).
To make such hostnames work, k-a adds entries to /etc/hosts in the
bootstrap-servers command. Alternatively users may provide DNS.
One problem with using /etc/hosts is that, if a new compute host is
added to the cloud, or an IP address is changed, that will not be
reflected in the /etc/hosts file of other hosts. This would cause live
migration to the new host from an old host to fail, as the name cannot
be resolved.
The workaround for this was to update the /etc/hosts file (perhaps via
bootstrap-servers) on all hosts after adding new compute hosts. Then the
nova_libvirt container had to be restarted to pick up the change.
Similarly, if user has overridden the migration_interface, the used
hostname could point to a wrong address on which libvirt would not
listen.
This change adds the live_migration_inbound_addr option to nova.conf. If
TLS is not in use for libvirt, this will be set to the IP address of the
host on the migration network. If TLS is enabled for libvirt,
live_migration_inbound_addr will be set to migration_hostname, since
certificates will typically reference the hostname rather than the
host's IP. With libvirt TLS enabled, DNS is recommended to avoid the
/etc/hosts issue which is likely the case in production deployments.
Change-Id: I0201b46a9fbab21433a9f53685131aeb461543a8
Closes-Bug: #1729566
This is a follow up to I001defc75d1f1e6caa9b1e11246abc6ce17c775b. To
maintain previous behaviour, and ensure we catch any host configuration
changes, we should perform host configuration during upgrade.
Change-Id: I79fcbf1efb02b7187406d3c3fccea6f200bcea69
Related-Bug: #1860161
The use of default(omit) is for module parameters, not templates. We
define a default value for openstack_cacert, so it should never be
undefined anyway.
Change-Id: Idfa73097ca168c76559dc4f3aa8bb30b7113ab28
Currently there are a few services that perform host configuration
tasks. This is done in config.yml. This means that these changes are
performed during 'kolla-ansible genconfig', when we might expect not to
be making any changes to the remote system.
This change separates out these host configuration tasks into a
config-host.yml file, which is included directly from deploy.yml.
One change in behaviour is that this prevents these tasks from running
during an upgrade or genconfig. This is probably what we want, but we
should be careful when any of these host configuration tasks are
changed, to ensure they are applied during an upgrade if necessary.
Change-Id: I001defc75d1f1e6caa9b1e11246abc6ce17c775b
Closes-Bug: #1860161
When change the cert file in /etc/kolla/certificate/.
The certificate in the container has not changed.
So I think can use kolla-ansible deploy when certificate is
changed. restart <container>
Partially-Implements: blueprint custom-cacerts
Change-Id: Iaac6f37e85ffdc0352e8062ae5049cc9a6b3db26
Signed-off-by: yj.bai <bai.yongjun@99cloud.net>
Both include_role and import_role expect role's name to be given
via "name" param instead of "role".
This worked but caused errors with ansible-lint.
See: https://review.opendev.org/694779
Change-Id: I388d4ae27111e430d38df1abcb6c6127d90a06e0
We assume that all groups are present in the inventory, and quite obtuse
errors can result if any are not.
This change adds a precheck that checks for the presence of all expected
groups in the inventory for each service. It also introduces a common
service-precheck role that we can use for other common prechecks.
Change-Id: Ia0af1e7df4fff7f07cd6530e5b017db8fba530b3
Partially-Implements: blueprint improve-prechecks
Kolla-Ansible Ceph deployment mechanism has been deprecated in Train [1].
This change removes the Ansible code and associated CI jobs.
[1]: https://review.opendev.org/669214
Change-Id: Ie2167f02ad2f525d3b0f553e2c047516acf55bc2
ceph.conf is loaded by qemu, not libvirt.
Since qemu runs as the nova user, ceph.conf owned by root
causes a permission error. The logs in
/var/log/libvirt/qemu/instance-*.log reveal the error.
This change fixes the issue by changing the ownership of ceph.conf
in nova-libvirt to the nova user.
Closes-Bug: #1861513
Change-Id: I1881f51a6c8508f0f186a5623443343dc1df41d4
Signed-off-by: Ning Yao <yaoning@unitedstack.com>
In dev mode currently the python source is mounted under python2.7
site-packages. This change fixes this to use the distro_python_version
variable to ensure dev mode works with Python 3 images.
Change-Id: Ieae3778a02f1b79023b4f1c20eff27b37f481077
Partially-Implements: blueprint python-3
To make the configuration easier for the user, and to allow non-standard
ceph authentication ids - introduce ceph_*_user variables.
Change-Id: I24e01c43c826b62b6748d93a498f4b7d8ce9e309
When kolla_copy_ca_into_containers is set to "yes", the Certificate
Authority in /etc/kolla/certificates will be copied into service
containers to enable trust for that CA. This is especially useful when
the CA is self signed, and would not be trusted by default.
Partially-Implements: blueprint custom-cacerts
Change-Id: I4368f8994147580460ebe7533850cf63a419d0b4
Introduce user modifiable variables instead of fixed-names
of Ceph keyring files for external Ceph functionality.
Change-Id: I1a33b3f9d6eca5babf53b91187461e43aef865ce