kolla-ansible

Author	SHA1	Message	Date
Mark Goddard	db4fc85c33	Revert "Performance: Use import_tasks in the main plays" This reverts commit 9cae59be51e8d2d798830042a5fd448a4aa5e7dc. Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues. Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc Closes-Bug: #1906288	2020-12-14 10:36:55 +00:00
Radosław Piliszek	9cae59be51	Performance: Use import_tasks in the main plays Main plays are action-redirect-stubs, ideal for import_tasks. This avoids 'include' penalty and makes logs/ara look nicer. Fixes haproxy and rabbitmq not to check the host group as well. Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0 Partially-Implements: blueprint performance-improvements	2020-10-27 19:09:32 +01:00
Radosław Piliszek	3411b9e420	Performance: optimize genconfig Config plays do not need to check containers. This avoids skipping tasks during the genconfig action. Ironic and Glance rolling upgrades are handled specially. Swift and Bifrost do not use the handlers at all. Partially-Implements: blueprint performance-improvements Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a	2020-10-12 19:30:06 +02:00
Zuul	6c5e9321e4	Merge "Allow to skip and unset sysctl vars"	2020-10-08 10:21:31 +00:00
Zuul	21a96db1be	Merge "Add support for changing sysctl.conf path"	2020-10-07 16:33:31 +00:00
Michal Nasiadka	c52a89ae04	Use Docker healthchecks for core services This change enables the use of Docker healthchecks for core OpenStack services. Also check-failures.sh has been updated to treat containers with unhealthy status as failed. Implements: blueprint container-health-check Change-Id: I79c6b11511ce8af70f77e2f6a490b59b477fefbb	2020-10-05 08:35:47 +00:00
Zuul	ba933f16e9	Merge "Support TLS encryption of RabbitMQ client-server traffic"	2020-09-29 11:31:03 +00:00
Pierre Riteau	c81772024c	Reduce the use of SQLAlchemy connection pooling When the internal VIP is moved in the event of a failure of the active controller, OpenStack services can become unresponsive as they try to talk with MariaDB using connections from the SQLAlchemy pool. It has been argued that OpenStack doesn't really need to use connection pooling with MariaDB [1]. This commit reduces the use of connection pooling via two configuration options: - max_pool_size is set to 1 to allow only a single connection in the pool (it is not possible to disable connection pooling entirely via oslo.db, and max_pool_size = 0 means unlimited pool size) - lower connection_recycle_time from the default of one hour to 10 seconds, which means the single connection in the pool will be recreated regularly These settings have shown better reactivity of the system in the event of a failover. [1] http://lists.openstack.org/pipermail/openstack-dev/2015-April/061808.html Change-Id: Ib6a62d4428db9b95569314084090472870417f3d Closes-Bug: #1896635	2020-09-22 17:54:45 +02:00
Radosław Piliszek	bce266201b	Allow to skip and unset sysctl vars via KOLLA_SKIP and KOLLA_UNSET Change-Id: I7d9af21c2dd8c303066eb1ee4dff7a72bca24283 Related-Bug: #1837551	2020-09-21 13:13:58 +02:00
Radosław Piliszek	6be51fa67a	Add support for changing sysctl.conf path via kolla_sysctl_conf_path Change-Id: I09b20fa008a7fecedcb599b4792f24215179b853	2020-09-21 11:47:05 +02:00
wu.chunyang	88de8feb7b	replace internal with openstack_interface Replace hardcoded 'internal' with {{ openstack_interface }}. Change-Id: I885622967ffde2a7a1a08fedbde2eb0e4e330e22	2020-09-18 21:42:52 +08:00
Zuul	91f5861769	Merge "Support neutron_sriov_physnet_mappings to support multiple devices"	2020-09-17 16:53:26 +00:00
Bharat Kunwar	c24a280bee	Support neutron_sriov_physnet_mappings to support multiple devices Change-Id: Ifcedcc72307732393a92a702a7567addc043b5b2	2020-09-17 13:26:30 +00:00
Mark Goddard	761ea9a333	Support TLS encryption of RabbitMQ client-server traffic This change adds support for encryption of communication between OpenStack services and RabbitMQ. Server certificates are supported, but currently client certificates are not. The kolla-ansible certificates command has been updated to support generating certificates for RabbitMQ for development and testing. RabbitMQ TLS is enabled in the all-in-one source CI jobs, or when the Zuul 'tls_enabled' variable is true. Change-Id: I4f1d04150fb2b5af085b762890092f87ae6076b5 Implements: blueprint message-queue-ssl-support	2020-09-17 12:05:44 +01:00
Mark Goddard	3c02c966cb	Performance: remove one include_tasks in nova-cell Including tasks has a performance penalty when compared with importing tasks. The nova-cell role uses include_tasks twice when generating certificates and keys for libvirt TLS. While a dynamic include makes sense here for a non-default feature, we can use one include rather than two with the same effect. Since this task runs against compute nodes the overhead is significant. See [1] for benchmarks of include_tasks and import_tasks. [1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md Partially-Implements: blueprint performance-improvements Change-Id: Ic687d2f7d4625aede386e576ebb174da72142756	2020-08-28 16:16:56 +00:00
Mark Goddard	b685ac44e0	Performance: replace unconditional include_tasks with import_tasks Including tasks has a performance penalty when compared with importing tasks. If the include has a condition associated with it, then the overhead of the include may be lower than the overhead of skipping all imported tasks. For unconditionally included tasks, switching to import_tasks provides a clear benefit. Benchmarking of include vs. import is available at [1]. This change switches from include_tasks to import_tasks where there is no condition applied to the include. [1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md#task-include-and-import Partially-Implements: blueprint performance-improvements Change-Id: Ia45af4a198e422773d9f009c7f7b2e32ce9e3b97	2020-08-28 16:12:03 +00:00
Zuul	fa48cc7eaf	Merge "Use iSCSI multipath for libvirt"	2020-08-26 13:57:47 +00:00
Zuul	42f57166d4	Merge "replace os-tenant-name with os-project-name in openstackclient"	2020-08-24 10:27:40 +00:00
wu.chunyang	817cf80702	replace os-tenant-name with os-project-name in openstackclient openstackclient doesn't support the os-tenant-name parameter; use os-project-name instead of os-tenant-name. See https://docs.openstack.org/python-openstackclient/ussuri/cli/man/openstack.html Change-Id: Ibf17424c49118b4c3b7e621e04b43c8cdcf308a4	2020-08-22 23:02:30 +08:00
Zuul	e53dae8eff	Merge "Add cinder auth config to nova-cell nova.conf.j2"	2020-08-21 15:45:02 +00:00
Jegor van Opdorp	de16013bd6	Add cinder auth config to nova-cell nova.conf.j2 Fixes an issue during deleting evacuated instances with encrypted block devices. Change-Id: I9b9b689ef7e1e41b597e2c5f6b96f3ed011193c5 Closes-Bug: 1891462 Related-Bug: 1850279	2020-08-19 07:25:20 +00:00
Florian LEDUC	56710de59d	Use iSCSI multipath for libvirt The multipath daemon allows block devices to be reached via multiple paths for better resiliency and performance. Multipathd periodically checks the failed iSCSI paths and maintains a list of valid paths. Libvirt can use more than one iSCSI path when the option volume_use_multipath is set and multipathd is enabled. Change-Id: I54629656803c4989f7673e8c69d2a820609b5960 Implements: blueprint nova-libvirt-multipath-iscsi	2020-08-19 07:24:51 +00:00
Rafael Weingärtner	f425c0678f	Standardize use and construction of endpoint URLs The goal of this change is to normalize the construction and use of internal, external, and admin URLs. While extending Kolla-ansible to enable a more flexible method to manage external URLs, we noticed that the same URL was constructed multiple times in different parts of the code. This can make it difficult for people who want to work with these URLs and can create inconsistencies in a large code base over time. Therefore, we are proposing here the use of a "single Kolla-ansible variable" per endpoint URL, which makes things easier for people who are interested in overriding/extending these URLs. As an example, we extended Kolla-ansible to facilitate the "override" of public (external) URLs with the following standard "<component/serviceName>.<companyBaseUrl>". Therefore, the "NAT/redirect" in the SSL termination system (HAproxy, HTTPD or some other) is done via the service name, and not by the port. This allows operators to easily and automatically create more friendly URL names. To develop this feature, we first applied this patch that we are sending now to the community. We did that to reduce the surface of changes in Kolla-ansible. Another example is the integration of Kolla-ansible and Consul, which we also implemented internally, and which also requires URL changes. Therefore, this PR is essential to reduce code duplication, and to make it easier for users/developers to work with and customize the service URLs. Change-Id: I73d483e01476e779a5155b2e18dd5ea25f514e93 Signed-off-by: Rafael Weingärtner <rafael@apache.org>	2020-08-19 07:22:17 +00:00
Bharat Kunwar	4809462f4e	Deploy neutron-mlnx-agent and neutron-eswitchd containers Change-Id: I173669bdf92b1f2ea98907ba16808ca3c914944c	2020-08-13 23:33:57 +01:00
Zuul	516658f489	Merge "Mount /etc/timezone based on host OS"	2020-08-12 22:09:19 +00:00
Mark Goddard	146b00efa7	Mount /etc/timezone based on host OS Previously we mounted /etc/timezone if the kolla_base_distro is debian or ubuntu. This would fail prechecks if debian or ubuntu images were deployed on CentOS. While this is not a supported combination, for correctness we should fix the condition to reference the host OS rather than the container OS, since that is where the /etc/timezone file is located. Change-Id: Ifc252ae793e6974356fcdca810b373f362d24ba5 Closes-Bug: #1882553	2020-08-10 10:14:18 +01:00
Mark Goddard	97e26b49cd	Fix Barbican client (Castellan) with TLS (part 2) This patch is a continuation of I6a174468bd91d214c08477b93c88032a45c137be for the nova-cell role, which was missed. The Castellan (Barbican client) has different parameters to control the used CA file. This patch uses them. Moreover, this aligns Barbican with other services by defaulting its client config to the internal endpoint. See also [1]. [1] https://bugs.launchpad.net/castellan/+bug/1876102 Closes-Bug: #1886615 Change-Id: I056f3eebcf87bcbaaf89fdd0dc1f46d143db7785	2020-08-07 14:16:04 +01:00
Mark Goddard	9702d4c3c3	Performance: use import_tasks for check-containers.yml Including tasks has a performance penalty when compared with importing tasks. If the include has a condition associated with it, then the overhead of the include may be lower than the overhead of skipping all imported tasks. In the case of the check-containers.yml include, the included file only has a single task, so the overhead of skipping this task will not be greater than the overhead of the task import. It therefore makes sense to switch to use import_tasks there. Partially-Implements: blueprint performance-improvements Change-Id: I65d911670649960708b9f6a4c110d1a7df1ad8f7	2020-07-28 12:10:59 +01:00
Zuul	b0407ffb17	Merge "Make /dev/kvm permissions handling more robust"	2020-07-22 12:32:40 +00:00
Radosław Piliszek	202365e702	Make /dev/kvm permissions handling more robust This makes use of udev rules to make it smarter and override host-level packages settings. Additionally, this masks Ubuntu-only service that is another pain point in terms of /dev/kvm permissions. Fingers crossed for no further surprises. Change-Id: I61235b51e2e1325b8a9b4f85bf634f663c7ec3cc Closes-bug: #1681461	2020-07-17 17:51:18 +00:00
Zuul	9a8341c2a7	Merge "Performance: Run common role in a separate play"	2020-07-17 15:43:22 +00:00
Mark Goddard	2f91be9f39	Load br_netfilter module in nova-cell role The nova-cell role sets the following sysctls on compute hosts, which require the br_netfilter kernel module to be loaded: net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables If it is not loaded, then we see the following errors: Failed to reload sysctl: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory Loading the br_netfilter module resolves this issue. Typically we do not see this since installing Docker and configuring it to manage iptables rules causes the br_netfilter module to be loaded. There are good reasons [1] to disable Docker's iptables management however, in which case we are likely to hit this issue. This change loads the br_netfilter module in the nova-cell role for compute hosts. [1] https://bugs.launchpad.net/kolla-ansible/+bug/1849275 Co-Authored-By: Dincer Celik <hello@dincercelik.com> Change-Id: Id52668ba8dab460ad4c33fad430fc8611e70825e	2020-07-08 11:13:39 +01:00
Mark Goddard	56ae2db7ac	Performance: Run common role in a separate play The common role was previously added as a dependency to all other roles. It would set a fact after running on a host to avoid running twice. This had the nice effect that deploying any service would automatically pull in the common services for that host. When using tags, any services with matching tags would also run the common role. This could be both surprising and sometimes useful. When using Ansible at large scale, there is a penalty associated with executing a task against a large number of hosts, even if it is skipped. The common role introduces some overhead, just in determining that it has already run. This change extracts the common role into a separate play, and removes the dependency on it from all other roles. New groups have been added for cron, fluentd, and kolla-toolbox, similar to other services. This changes the behaviour in the following ways: * The common role is now run for all hosts at the beginning, rather than prior to their first enabled service * Hosts must be in the necessary group for each of the common services in order to have that service deployed. This is mostly to avoid deploying on localhost or the deployment host * If tags are specified for another service e.g. nova, the common role will not automatically run for matching hosts. The common tag must be specified explicitly The last of these is probably the largest behaviour change. While it would be possible to determine which hosts should automatically run the common role, it would be quite complex, and would introduce some overhead that would probably negate the benefit of splitting out the common role. Partially-Implements: blueprint performance-improvements Change-Id: I6a4676bf6efeebc61383ec7a406db07c7a868b2a	2020-07-07 15:00:47 +00:00
Pierre Riteau	c40e806587	Remove policy file from nova-conductor config.json template Change I810aad7d49db3f5a7fd9a2f0f746fd912fe03917 for supporting multiple Nova cells updated the list of containers that require a policy file to only include nova-api, nova-compute, and nova-compute-ironic. The nova-conductor config.json template was left unchanged and fails to copy the nova policy file into its container. This can be seen on a fresh deployment, but might be missed on an upgrade if an older policy file is still available in /etc/kolla/nova-conductor. This commit removes the nova_policy_file block from the nova-conductor config.json template, as it shouldn't be required. Backport: ussuri, train Change-Id: I17256b182d207aeba3f92c65a6d7cf3611180558 Closes-Bug: #1886170	2020-07-03 12:52:57 +02:00
wu.chunyang	a9c94aee39	nova-cell role clone failed When kolla_dev_mod is enabled, the nova-cell role fails to clone the code because it uses a nova-cell repository, which does not exist. The nova-cell role should use the nova repository too. Change-Id: I7fa62726d0d5b0aeb3bd5fa06dc0e59667f94fa0	2020-06-22 22:12:11 +08:00
gugug	f220970d46	Clean up the unnecessary "" for include_tasks The double quotes are not necessary for include_tasks; this patch set cleans them up. Change-Id: I0701035d185fdf19286cced7fe51fc277511e4c1	2020-06-16 23:36:42 +08:00
Zuul	e74cada7c1	Merge "permission denied when enable_kolla_dev_mod"	2020-06-10 02:32:45 +00:00
Christian Berendt	60e03d7bf3	Remove XenAPI integration Change-Id: Iea3f4f3d2e5c6040c1e0bc7bfae8719cc7d8ac55	2020-06-09 13:56:17 +02:00
wu.chunyang	3e9a648601	permission denied when enable_kolla_dev_mod A non-root user has no permission to create a directory under /opt. Use "become: true" to resolve it. Change-Id: I155efc4b1e0691da0aaf6ef19ca709e9dc2d9168	2020-06-07 19:36:42 +08:00
Zuul	c07ee9af4f	Merge "Configure RabbitMQ user tags in nova-cell role"	2020-05-17 12:26:56 +00:00
Jeffrey Zhang	869e3f21c2	Configure RabbitMQ user tags in nova-cell role The RabbitMQ 'openstack' user has the 'administrator' tag assigned via the RabbitMQ definitions.json file. Since the Train release, the nova-cell role also configures the RabbitMQ user, but omits the tag. This causes the tag to be removed from the user, which prevents it from accessing the management UI and API. This change adds support for configuring user tags to the service-rabbitmq role, and sets the administrator tag by default. Change-Id: I7a5d6fe324dd133e0929804d431583e5b5c1853d Closes-Bug: #1875786	2020-05-15 16:02:46 +01:00
Radosław Piliszek	93c9ad892c	Make nova perms consistent between applications Nova cells support introduced a slight regression that triggers odd behaviour when we tried switching to Apache (httpd) [1]. Bootstrap no longer applied permissions recursively to all log files, creating a discrepancy between normal and bootstrap runs and also Nova and other services such as Cinder (regarding bootstrap logging). This patch fixes it. Backport to Train. Not creating reno nor a bug record because it does not affect any current standard usage in any currently known way. Note this only really hides (standardizes?) the global issue that we don't control file permissions on newly created files too well. [1] https://review.opendev.org/724793 Change-Id: I35e9924ccede5edd2e1307043379aba944725143 Needed-By: https://review.opendev.org/724793	2020-05-06 18:36:10 +00:00
Zuul	76d69cae0e	Merge "Fix nova cell message queue URL with separate notification queue"	2020-04-26 16:46:35 +00:00
Zuul	4d49397d72	Merge "nova: Add debug logging to libvirtd.conf"	2020-04-26 15:46:29 +00:00
Zuul	7a193d1f06	Merge "Ansible lint: lines longer than 160 chars"	2020-04-17 09:29:00 +00:00
Zuul	87984f5425	Merge "Add Ansible group check to prechecks"	2020-04-16 15:33:46 +00:00
Zuul	2e2672e753	Merge "Fix nova compute addition with limit"	2020-04-16 15:33:44 +00:00
Zuul	7f42813159	Merge "Refactor copy certificates task"	2020-04-16 14:03:37 +00:00
Michal Nasiadka	d403690b88	Ansible lint: lines longer than 160 chars Change-Id: I500cc8800c412bc0e95edb15babad5c1189e6ee4	2020-04-16 15:59:06 +02:00
Mark Goddard	e8ad5f37d4	Fix nova cell message queue URL with separate notification queue If using a separate message queue for nova notifications, i.e. nova_cell_notify_transport_url is different from nova_cell_rpc_transport_url, then Kolla Ansible will unnecessarily update the cell. This should not cause any issues since the URL is taken from nova.conf. This change fixes the comparison to use the correct URL. Change-Id: I5f0e30957bfd70295f2c22c86349ebbb4c1fb155 Closes-Bug: #1873255	2020-04-16 12:32:40 +01:00
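
The connection pooling change in c81772024c names two options and their new values. A minimal sketch of the resulting settings follows, assuming they are written to the [database] section of a service's configuration file (the section oslo.db reads); the function name and the example connection URL are hypothetical and do not come from the commit.

```python
import configparser
import io


def render_database_section(connection_url):
    """Render a [database] section with the reduced pooling settings.

    max_pool_size and connection_recycle_time are the two options named in
    the commit message. The section layout is an assumption for illustration.
    """
    # interpolation=None so that a '%' in a password is not treated specially.
    config = configparser.ConfigParser(interpolation=None)
    config["database"] = {
        "connection": connection_url,
        # A single pooled connection; 0 would mean an unlimited pool size.
        "max_pool_size": "1",
        # Recreate the pooled connection every 10 seconds instead of hourly.
        "connection_recycle_time": "10",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical URL using a documentation-only IP address.
    print(render_database_section("mysql+pymysql://nova:secret@192.0.2.10:3306/nova"))
```

The short recycle time is what limits how long a service can hold a connection that still points at the old controller after the internal VIP moves.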

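Commit 2f91be9f39 explains that the two net.bridge sysctls only exist under /proc/sys once the br_netfilter module is loaded. The check below is a rough sketch of how that condition could be detected on a host. It is not the role's actual implementation; the function name and the proc_sys_root parameter are hypothetical, added so the check can be pointed at a test directory.

```python
import os

# The two sysctls listed in the commit message.
BRIDGE_SYSCTLS = (
    "net.bridge.bridge-nf-call-iptables",
    "net.bridge.bridge-nf-call-ip6tables",
)


def missing_bridge_sysctls(proc_sys_root="/proc/sys"):
    """Return the bridge sysctls whose files are absent under proc_sys_root.

    A non-empty result suggests that br_netfilter is not loaded, which is
    the situation that produces the "cannot stat" errors quoted above.
    """
    missing = []
    for name in BRIDGE_SYSCTLS:
        # net.bridge.bridge-nf-call-iptables maps to
        # <root>/net/bridge/bridge-nf-call-iptables
        path = os.path.join(proc_sys_root, *name.split("."))
        if not os.path.exists(path):
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_bridge_sysctls()
    if absent:
        print("br_netfilter is probably not loaded; missing:", ", ".join(absent))
    else:
        print("bridge netfilter sysctls are present")
```
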

91 Commits