64 Commits

Author SHA1 Message Date
Mark Goddard
9702d4c3c3 Performance: use import_tasks for check-containers.yml
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. In the case of the check-containers.yml include, the
included file only has a single task, so the overhead of skipping this
task will not be greater than the overhead of the task import. It
therefore makes sense to switch to use import_tasks there.
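
As a rough sketch of the switch described above (the `when` expression is
illustrative only and is not taken from the change itself):

```yaml
# Before: a dynamic include, processed at runtime for each host.
- include_tasks: check-containers.yml
  when: kolla_action != "config"   # illustrative condition (assumption)

# After: a static import, expanded when the playbook is parsed. The
# condition is then applied to the single task inside the file.
- import_tasks: check-containers.yml
  when: kolla_action != "config"   # illustrative condition (assumption)
```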

Partially-Implements: blueprint performance-improvements

Change-Id: I65d911670649960708b9f6a4c110d1a7df1ad8f7
2020-07-28 12:10:59 +01:00
Zuul
b0407ffb17 Merge "Make /dev/kvm permissions handling more robust" 2020-07-22 12:32:40 +00:00
Radosław Piliszek
202365e702 Make /dev/kvm permissions handling more robust
This makes use of udev rules to make it smarter and override
host-level package settings.
Additionally, this masks an Ubuntu-only service that is another
pain point in terms of /dev/kvm permissions.
Fingers crossed for no further surprises.

Change-Id: I61235b51e2e1325b8a9b4f85bf634f663c7ec3cc
Closes-bug: #1681461
2020-07-17 17:51:18 +00:00
Zuul
9a8341c2a7 Merge "Performance: Run common role in a separate play" 2020-07-17 15:43:22 +00:00
Mark Goddard
2f91be9f39 Load br_netfilter module in nova-cell role
The nova-cell role sets the following sysctls on compute hosts, which
require the br_netfilter kernel module to be loaded:

    net.bridge.bridge-nf-call-iptables
    net.bridge.bridge-nf-call-ip6tables

If it is not loaded, then we see the following errors:

    Failed to reload sysctl:
    sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
    sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory

Loading the br_netfilter module resolves this issue.

Typically we do not see this since installing Docker and configuring it
to manage iptables rules causes the br_netfilter module to be loaded.
There are, however, good reasons [1] to disable Docker's iptables
management, in which case we are likely to hit this issue.

This change loads the br_netfilter module in the nova-cell role for
compute hosts.
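
A minimal sketch of the idea, using the stock Ansible modprobe and sysctl
modules. The task names and the sysctl value of 1 are assumptions made for
illustration; they are not taken from the role:

```yaml
- name: Load the br_netfilter kernel module
  become: true
  modprobe:
    name: br_netfilter
    state: present

# These keys only exist under /proc/sys/net/bridge once the module is loaded.
- name: Set the bridge netfilter sysctls
  become: true
  sysctl:
    name: "{{ item }}"
    value: "1"          # assumed value, for illustration only
    sysctl_set: true
  loop:
    - net.bridge.bridge-nf-call-iptables
    - net.bridge.bridge-nf-call-ip6tables
```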

[1] https://bugs.launchpad.net/kolla-ansible/+bug/1849275

Co-Authored-By: Dincer Celik <hello@dincercelik.com>

Change-Id: Id52668ba8dab460ad4c33fad430fc8611e70825e
2020-07-08 11:13:39 +01:00
Mark Goddard
56ae2db7ac Performance: Run common role in a separate play
The common role was previously added as a dependency to all other roles.
It would set a fact after running on a host to avoid running twice. This
had the nice effect that deploying any service would automatically pull
in the common services for that host. When using tags, any services with
matching tags would also run the common role. This could be both
surprising and sometimes useful.

When using Ansible at large scale, there is a penalty associated with
executing a task against a large number of hosts, even if it is skipped.
The common role introduces some overhead, just in determining that it
has already run.

This change extracts the common role into a separate play, and removes
the dependency on it from all other roles. New groups have been added
for cron, fluentd, and kolla-toolbox, similar to other services. This
changes the behaviour in the following ways:

* The common role is now run for all hosts at the beginning, rather than
  prior to their first enabled service
* Hosts must be in the necessary group for each of the common services
  in order to have that service deployed. This is mostly to avoid
  deploying on localhost or the deployment host
* If tags are specified for another service e.g. nova, the common role
  will *not* automatically run for matching hosts. The common tag must
  be specified explicitly

The last of these is probably the largest behaviour change. While it
would be possible to determine which hosts should automatically run the
common role, it would be quite complex, and would introduce some
overhead that would probably negate the benefit of splitting out the
common role.
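
A sketch of what such a separate play could look like. The group names and
the common tag come from the description above; the play name and layout
are assumptions:

```yaml
# Dedicated play for the common role, run before the service plays.
- name: Apply role common
  hosts:
    - cron
    - fluentd
    - kolla-toolbox
  tags:
    - common
  roles:
    - role: common
```

With a layout like this, a tagged run that should also apply the common
services needs both tags, for example
`kolla-ansible deploy --tags common,nova`.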

Partially-Implements: blueprint performance-improvements

Change-Id: I6a4676bf6efeebc61383ec7a406db07c7a868b2a
2020-07-07 15:00:47 +00:00
Pierre Riteau
c40e806587 Remove policy file from nova-conductor config.json template
Change I810aad7d49db3f5a7fd9a2f0f746fd912fe03917 for supporting multiple
Nova cells updated the list of containers that require a policy file to
only include nova-api, nova-compute, and nova-compute-ironic.

The nova-conductor config.json template was left unchanged and fails to
copy the nova policy file into its container. This can be seen on a
fresh deployment, but might be missed on an upgrade if an older policy
file is still available in /etc/kolla/nova-conductor.

This commit removes the nova_policy_file block from the nova-conductor
config.json template, as it shouldn't be required.

Backport: ussuri, train
Change-Id: I17256b182d207aeba3f92c65a6d7cf3611180558
Closes-Bug: #1886170
2020-07-03 12:52:57 +02:00
wu.chunyang
a9c94aee39 nova-cell role clone failed
When kolla_dev_mod is enabled, the nova-cell role fails to clone the
code, because it uses a nova-cell repository which does not exist.
The nova-cell role should use the nova repository instead.

Change-Id: I7fa62726d0d5b0aeb3bd5fa06dc0e59667f94fa0
2020-06-22 22:12:11 +08:00
gugug
f220970d46 Clean up the unnecessary "" for include_tasks
The double quotes are not necessary for include_tasks; this
patch set removes them.

Change-Id: I0701035d185fdf19286cced7fe51fc277511e4c1
2020-06-16 23:36:42 +08:00
Zuul
e74cada7c1 Merge "permission denied when enable_kolla_dev_mod" 2020-06-10 02:32:45 +00:00
Christian Berendt
60e03d7bf3 Remove XenAPI integration
Change-Id: Iea3f4f3d2e5c6040c1e0bc7bfae8719cc7d8ac55
2020-06-09 13:56:17 +02:00
wu.chunyang
3e9a648601 permission denied when enable_kolla_dev_mod
A non-root user has no permission to create a directory under
/opt. Use "become: true" to resolve this.
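
For illustration, a directory-creation task with privilege escalation
might look like the following. The task name and path are hypothetical:

```yaml
- name: Ensure the development source directory exists
  become: true               # run with elevated privileges
  file:
    path: /opt/example-src   # hypothetical path under /opt
    state: directory
    mode: "0755"
```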

Change-Id: I155efc4b1e0691da0aaf6ef19ca709e9dc2d9168
2020-06-07 19:36:42 +08:00
Zuul
c07ee9af4f Merge "Configure RabbitMQ user tags in nova-cell role" 2020-05-17 12:26:56 +00:00
Jeffrey Zhang
869e3f21c2 Configure RabbitMQ user tags in nova-cell role
The RabbitMQ 'openstack' user has the 'administrator' tag assigned via
the RabbitMQ definitions.json file.

Since the Train release, the nova-cell role also configures the RabbitMQ
user, but omits the tag. This causes the tag to be removed from the
user, which prevents it from accessing the management UI and API.

This change adds support for configuring user tags to the
service-rabbitmq role, and sets the administrator tag by default.
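
A sketch using the stock rabbitmq_user module directly, which is not
necessarily how the role invokes it. The password variable, vhost, and
permission patterns are assumptions:

```yaml
- name: Ensure the RabbitMQ user exists with the administrator tag
  rabbitmq_user:
    user: openstack
    password: "{{ example_rabbitmq_password }}"   # hypothetical variable
    vhost: /
    configure_priv: ".*"
    read_priv: ".*"
    write_priv: ".*"
    # Per the description above, leaving tags out caused the existing
    # tag to be removed from the user.
    tags: administrator
    state: present
```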

Change-Id: I7a5d6fe324dd133e0929804d431583e5b5c1853d
Closes-Bug: #1875786
2020-05-15 16:02:46 +01:00
Radosław Piliszek
93c9ad892c Make nova perms consistent between applications
Nova cells support introduced a slight regression that triggered
odd behaviour when we tried switching to Apache (httpd) [1].
Bootstrap no longer applied permissions recursively to all log
files, creating a discrepancy between normal and bootstrap runs
and also Nova and other services such as Cinder (regarding
bootstrap logging).

This patch fixes it.

Backport to Train.

Not creating reno nor a bug record because it does not affect
any current standard usage in any currently known way.

Note this only really hides (standardizes?) the global issue that
we don't control file permissions on newly created files too well.

[1] https://review.opendev.org/724793

Change-Id: I35e9924ccede5edd2e1307043379aba944725143
Needed-By: https://review.opendev.org/724793
2020-05-06 18:36:10 +00:00
Zuul
76d69cae0e Merge "Fix nova cell message queue URL with separate notification queue" 2020-04-26 16:46:35 +00:00
Zuul
4d49397d72 Merge "nova: Add debug logging to libvirtd.conf" 2020-04-26 15:46:29 +00:00
Zuul
7a193d1f06 Merge "Ansible lint: lines longer than 160 chars" 2020-04-17 09:29:00 +00:00
Zuul
87984f5425 Merge "Add Ansible group check to prechecks" 2020-04-16 15:33:46 +00:00
Zuul
2e2672e753 Merge "Fix nova compute addition with limit" 2020-04-16 15:33:44 +00:00
Zuul
7f42813159 Merge "Refactor copy certificates task" 2020-04-16 14:03:37 +00:00
Michal Nasiadka
d403690b88 Ansible lint: lines longer than 160 chars
Change-Id: I500cc8800c412bc0e95edb15babad5c1189e6ee4
2020-04-16 15:59:06 +02:00
Mark Goddard
e8ad5f37d4 Fix nova cell message queue URL with separate notification queue
If using a separate message queue for nova notifications, i.e.
nova_cell_notify_transport_url is different from
nova_cell_rpc_transport_url, then Kolla Ansible will unnecessarily
update the cell. This should not cause any issues since the URL is taken
from nova.conf.

This change fixes the comparison to use the correct URL.

Change-Id: I5f0e30957bfd70295f2c22c86349ebbb4c1fb155
Closes-Bug: #1873255
2020-04-16 12:32:40 +01:00
Michal Nasiadka
87a1c06b84 nova: Add debug logging to libvirtd.conf
Change-Id: Ibbb962b035b695eec022566cf9f7d6c200480c45
2020-04-15 17:05:57 +02:00
Mark Goddard
3af28d2151 Fix nova compute addition with limit
Deploy a small cloud. Add one host to the compute group in the
inventory, and scale out:

    $ kolla-ansible deploy --limit <new compute host>

The command succeeds, but creating an instance fails with the following:

    Host 'compute0' is not mapped to any cell

This happens because we only discover computes on the first host in the
cell's nova conductor group. If that host is not in the specified limit,
the discovery will not happen.

This change fixes the issue by running compute discovery when any ironic
or virtualised compute hosts are in the play batch, and delegating it to
a conductor.
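
One way to express this pattern is sketched below. The group names, the
container name, and the use of docker exec are assumptions, and a single
compute group stands in for the ironic and virtualised compute groups:

```yaml
- name: Discover compute hosts in the cell
  become: true
  command: docker exec nova_conductor nova-manage cell_v2 discover_hosts
  run_once: true
  # Run on a conductor even if it is outside the --limit.
  delegate_to: "{{ groups['nova-conductor'][0] }}"
  # Only run when at least one compute host is in the current batch.
  when: ansible_play_batch | intersect(groups['compute']) | length > 0
```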

Change-Id: Ie984806240d147add825ffa8446ae6ff55ca4814
Closes-Bug: #1869371
2020-04-14 19:36:49 +00:00
James Kirsch
4d155d69cd Refactor copy certificates task
Refactor service configuration to use the copy certificates task. This
reduces code duplication and simplifies implementing encrypting backend
HAProxy traffic for individual services.

Change-Id: I0474324b60a5f792ef5210ab336639edf7a8cd9e
2020-04-14 17:26:19 +00:00
Zuul
969159cc17 Merge "Fix live migration to use migration int. address" 2020-04-12 06:14:09 +00:00
Zuul
9d217e92aa Merge "Introduce /etc/timezone to Debian/Ubuntu containers" 2020-04-10 10:38:37 +00:00
Dincer Celik
4b5df0d866 Introduce /etc/timezone to Debian/Ubuntu containers
Some services look for /etc/timezone on Debian/Ubuntu, so we should
introduce it to the containers.

In addition, added prechecks for /etc/localtime and /etc/timezone.

Closes-Bug: #1821592
Change-Id: I9fef14643d1bcc7eee9547eb87fa1fb436d8a6b3
2020-04-09 18:53:36 +00:00
John Garbutt
628c27ce9e Fix live migration to use migration int. address
In kolla ansible we typically configure services to communicate via IP
addresses rather than hostnames. One accidental exception to this was
live migration, which used the hostname of the destination even when
not required (i.e. TLS not being used for libvirt).

To make such hostnames work, k-a adds entries to /etc/hosts in the
bootstrap-servers command. Alternatively users may provide DNS.

One problem with using /etc/hosts is that, if a new compute host is
added to the cloud, or an IP address is changed, that will not be
reflected in the /etc/hosts file of other hosts. This would cause live
migration to the new host from an old host to fail, as the name cannot
be resolved.

The workaround for this was to update the /etc/hosts file (perhaps via
bootstrap-servers) on all hosts after adding new compute hosts. Then the
nova_libvirt container had to be restarted to pick up the change.

Similarly, if the user has overridden the migration_interface, the
hostname used could point to the wrong address, one on which libvirt
would not listen.

This change adds the live_migration_inbound_addr option to nova.conf. If
TLS is not in use for libvirt, this will be set to the IP address of the
host on the migration network. If TLS is enabled for libvirt,
live_migration_inbound_addr will be set to migration_hostname, since
certificates will typically reference the hostname rather than the
host's IP. With libvirt TLS enabled, DNS is recommended to avoid the
/etc/hosts issue which is likely the case in production deployments.
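
The selection logic can be sketched as an Ansible fact. The names
example_libvirt_tls and example_migration_ip are hypothetical stand-ins
for the TLS flag and the host's address on the migration network:

```yaml
- name: Choose the address advertised for inbound live migration
  set_fact:
    # Hostname when libvirt TLS is enabled (certificates reference it),
    # otherwise the IP address on the migration network.
    live_migration_inbound_addr: "{{ migration_hostname if (example_libvirt_tls | bool) else example_migration_ip }}"
```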

Change-Id: I0201b46a9fbab21433a9f53685131aeb461543a8
Closes-Bug: #1729566
2020-04-09 18:17:07 +00:00
Mark Goddard
1d70f509e3 Perform host configuration during upgrade
This is a follow up to I001defc75d1f1e6caa9b1e11246abc6ce17c775b. To
maintain previous behaviour, and ensure we catch any host configuration
changes, we should perform host configuration during upgrade.

Change-Id: I79fcbf1efb02b7187406d3c3fccea6f200bcea69
Related-Bug: #1860161
2020-04-08 17:03:22 +01:00
Zuul
7c92e56cfd Merge "Separate per-service host configuration tasks" 2020-04-05 16:40:27 +00:00
Mark Goddard
0edad7138c Remove default(omit) from openstack_cacert in templates
The use of default(omit) is for module parameters, not templates. We
define a default value for openstack_cacert, so it should never be
undefined anyway.
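
For contrast, this is the kind of place where default(omit) is meaningful:
a module parameter that should be left unset when no value is given. The
task and the example_mode variable are hypothetical:

```yaml
- name: Create a file, leaving mode unset unless a value is provided
  file:
    path: /tmp/example
    state: touch
    # As a module parameter, omit drops the argument entirely. In a
    # template it would render as a placeholder string instead.
    mode: "{{ example_mode | default(omit) }}"
```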

Change-Id: Idfa73097ca168c76559dc4f3aa8bb30b7113ab28
2020-04-03 14:49:11 +01:00
Mark Goddard
fdea19a305 Separate per-service host configuration tasks
Currently there are a few services that perform host configuration
tasks. This is done in config.yml. This means that these changes are
performed during 'kolla-ansible genconfig', when we might expect not to
be making any changes to the remote system.

This change separates out these host configuration tasks into a
config-host.yml file, which is included directly from deploy.yml.

One change in behaviour is that this prevents these tasks from running
during an upgrade or genconfig. This is probably what we want, but we
should be careful when any of these host configuration tasks are
changed, to ensure they are applied during an upgrade if necessary.
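
A sketch of how deploy.yml could pull in the two files after this split.
The file names come from the description above; the ordering and the rest
of deploy.yml are assumptions:

```yaml
# deploy.yml (sketch)
- include_tasks: config-host.yml   # host configuration, deploy only

- include_tasks: config.yml        # service configuration, shared with genconfig
```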

Change-Id: I001defc75d1f1e6caa9b1e11246abc6ce17c775b
Closes-Bug: #1860161
2020-04-02 13:51:56 +00:00
Zuul
2a2ce059dc Merge "Add notify restart container when cert changed" 2020-03-10 12:12:55 +00:00
yj.bai
d3cc2f670e Add notify restart container when cert changed
When the certificate file in /etc/kolla/certificate/ is changed,
the certificate in the container does not change.
This change notifies the container restart handler when the
certificate changes, so that kolla-ansible deploy restarts <container>.

Partially-Implements: blueprint custom-cacerts

Change-Id: Iaac6f37e85ffdc0352e8062ae5049cc9a6b3db26
Signed-off-by: yj.bai <bai.yongjun@99cloud.net>
2020-03-10 16:23:09 +08:00
Radosław Piliszek
266fd61ad7 Use "name:" instead of "role:" for *_role modules
Both include_role and import_role expect the role's name to be given
via the "name" param instead of "role".
Using "role" worked but caused errors with ansible-lint.
See: https://review.opendev.org/694779
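
The difference in a nutshell, using a hypothetical role name:

```yaml
# Worked at the time, but was flagged by ansible-lint:
- import_role:
    role: example-role   # hypothetical role name

# Preferred form:
- import_role:
    name: example-role
```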

Change-Id: I388d4ae27111e430d38df1abcb6c6127d90a06e0
2020-03-02 10:01:17 +01:00
Mark Goddard
49fb55f182 Add Ansible group check to prechecks
We assume that all groups are present in the inventory, and quite obtuse
errors can result if any are not.

This change adds a precheck that checks for the presence of all expected
groups in the inventory for each service. It also introduces a common
service-precheck role that we can use for other common prechecks.
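
A minimal sketch of such a check. The expected_groups variable and the
group names in it are hypothetical, and this is not the role's actual
task:

```yaml
- name: Fail if expected groups are missing from the inventory
  vars:
    expected_groups:      # hypothetical list of groups a service needs
      - nova-conductor
      - compute
  fail:
    msg: "The group '{{ item }}' is missing from the inventory."
  loop: "{{ expected_groups }}"
  when: item not in groups
```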

Change-Id: Ia0af1e7df4fff7f07cd6530e5b017db8fba530b3
Partially-Implements: blueprint improve-prechecks
2020-02-28 16:23:14 +00:00
Michal Nasiadka
4e6fe7a6da Remove kolla-ceph
The Kolla-Ansible Ceph deployment mechanism was deprecated in Train [1].

This change removes the Ansible code and associated CI jobs.

[1]: https://review.opendev.org/669214

Change-Id: Ie2167f02ad2f525d3b0f553e2c047516acf55bc2
2020-02-11 11:42:06 +01:00
Zuul
b3c8ff59f1 Merge "Copy CA into containers." 2020-02-07 17:25:01 +00:00
Zuul
666b58b383 Merge "Python 3: Use distro_python_version for dev mode" 2020-02-04 13:40:31 +00:00
Zuul
b9b8aaa02a Merge "Fix qemu loading of ceph.conf (permission error)" 2020-02-01 12:00:55 +00:00
Ning Yao
91910d2a45 Fix qemu loading of ceph.conf (permission error)
ceph.conf is loaded by qemu, not libvirt.
Since qemu runs as the nova user, ceph.conf owned by root
causes a permission error. The logs in
/var/log/libvirt/qemu/instance-*.log reveal the error.

This change fixes the issue by changing the ownership of ceph.conf
in nova-libvirt to the nova user.

Closes-Bug: #1861513
Change-Id: I1881f51a6c8508f0f186a5623443343dc1df41d4
Signed-off-by: Ning Yao <yaoning@unitedstack.com>
2020-01-31 17:50:50 +01:00
Mark Goddard
5a786436be Python 3: Use distro_python_version for dev mode
In dev mode currently the python source is mounted under python2.7
site-packages. This change fixes this to use the distro_python_version
variable to ensure dev mode works with Python 3 images.

Change-Id: Ieae3778a02f1b79023b4f1c20eff27b37f481077
Partially-Implements: blueprint python-3
2020-01-30 14:00:34 +00:00
Michal Nasiadka
fdf3729f83 External Ceph: add ceph_*_user variables
To make the configuration easier for the user, and to allow non-standard
Ceph authentication IDs, introduce ceph_*_user variables.

Change-Id: I24e01c43c826b62b6748d93a498f4b7d8ce9e309
2020-01-29 11:06:58 +00:00
James Kirsch
511ba9f6a2 Copy CA into containers.
When kolla_copy_ca_into_containers is set to "yes", the Certificate
Authority in /etc/kolla/certificates will be copied into service
containers to enable trust for that CA. This is especially useful when
the CA is self-signed, and would not be trusted by default.
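
Enabling the option might look like this, assuming it is set in the
operator's globals.yml alongside other Kolla Ansible settings:

```yaml
# globals.yml (assumed location)
kolla_copy_ca_into_containers: "yes"
```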

Partially-Implements: blueprint custom-cacerts

Change-Id: I4368f8994147580460ebe7533850cf63a419d0b4
2020-01-28 14:03:32 -08:00
Zuul
13dea3f931 Merge "External Ceph: keys as variables" 2020-01-23 12:43:43 +00:00
Michal Nasiadka
1f929336e3 External Ceph: keys as variables
Introduce user-modifiable variables in place of the fixed names
of Ceph keyring files for external Ceph functionality.

Change-Id: I1a33b3f9d6eca5babf53b91187461e43aef865ce
2020-01-22 18:16:38 +00:00
Zuul
5126087af5 Merge "CentOS 8: Support variable image tag suffix" 2020-01-21 09:29:58 +00:00
Zuul
2c2eeb8159 Merge "Configure services to use Certificate Authority" 2020-01-15 22:16:30 +00:00