289 Commits

Author SHA1 Message Date
Zuul
87e671c6b4 Merge "Add a job that *only* deploys updated containers" 2019-09-30 13:19:28 +00:00
Doug Szumski
0d7a34e8c7 Remove Nova legacy upgrade
The rolling upgrade has been the default since Stein. The legacy
upgrade has been removed because it doesn't follow the upgrade
guide [1].

[1] https://docs.openstack.org/nova/latest/user/upgrade.html

Change-Id: I2aa879699cb4e9955bf5c38053eada5a53fb6211
2019-09-26 18:04:50 +01:00
Kris Lindgren
2fe0d98ebb Add a job that *only* deploys updated containers
Sometimes, as cloud admins, we only want to update the code that is
running in a cloud, without doing anything else. Add an action to
kolla-ansible that allows us to do that.

Change-Id: I904f595c69f7276e71692696471e32fd1f88e6e8
Implements: blueprint deploy-containers-action
2019-09-26 17:51:14 +01:00
Zuul
340b6d9456 Merge "Add support for libvirt+tls" 2019-09-26 09:19:33 +00:00
Kris Lindgren
f8cfccb99e Add support for libvirt+tls
To securely support live migration between compute nodes, we should
enable TLS with certificate authentication instead of plain TCP with no
authentication.

Implements: blueprint libvirt-tls

Change-Id: I22ea6233933c840b853fdcc8e03400b2bf577271
2019-09-19 15:32:41 +01:00
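
A hedged sketch of how an operator would turn this on; the flag name is
an assumption based on the blueprint and may differ in the final
implementation:

---snip---
# globals.yml (assumed flag name -- check the release documentation)
libvirt_tls: "yes"
---snap---
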
Zuul
a21b9b5430 Merge "Refactor service, endpoint and user registration" 2019-09-18 17:34:15 +00:00
Mark Goddard
3522d235bd Refactor service, endpoint and user registration
Use upstream Ansible modules for registration of services, endpoints,
users, projects, roles, and role grants.

Change-Id: I7c9138d422cc91c177fd8992347176bb54156b5a
2019-09-17 10:13:56 -07:00
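
A minimal sketch of what registration with the upstream modules looks
like; service names, URLs and passwords are illustrative, and
kolla-ansible wraps these calls in its own tooling rather than a
standalone play:

---snip---
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Register the nova service
      os_keystone_service:
        name: nova
        service_type: compute
        description: OpenStack Compute

    - name: Register the nova public endpoint
      os_keystone_endpoint:
        service: nova
        endpoint_interface: public
        url: https://nova.example.com:8774/v2.1
        region: RegionOne

    - name: Create the nova service user
      os_user:
        name: nova
        password: "{{ nova_keystone_password }}"
        domain: default

    - name: Grant the admin role to the nova user
      os_user_role:
        user: nova
        role: admin
        project: service
---snap---
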
Yang Youseok
f1f12d70a3 Ignore create_cells and discover_computes when nova-api is disabled
When the nova-api group has no hosts, we don't have to run create_cells
and discover_computes. Add conditional blocks to prevent them from
running.

Change-Id: Ia1ba058c1b74b06b678f45544883e567e2b4eb55
Closes-Bug: #1843235
2019-09-11 17:51:27 +09:00
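
A hedged sketch of the kind of guard this adds (role-task level; the
group name follows kolla-ansible's inventory convention):

---snip---
- name: Discover nova compute hosts
  command: nova-manage cell_v2 discover_hosts
  changed_when: false
  when: groups['nova-api'] | default([]) | length > 0
---snap---
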
Doug Szumski
7b636033ee Fix Nova cell search
The output from `nova-manage cell_v2 list_cells --verbose` contains
an extra column, stating whether the cell is enabled or not. This means
that the regex never matches, so existing_cells is always empty.

This fix updates the regex by adding a match group for this field which
may be used in a later change.

Unfortunately the CLI doesn't output in JSON format, which would make
this a lot less messy.

Closes-Bug: #1842460
Change-Id: Ib6400b33785f3ef674bffc9329feb3e33bd3f9a3
2019-09-03 18:12:14 +01:00
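
An illustration of the approach, not the exact expression used in the
role: parse the table output of `nova-manage cell_v2 list_cells
--verbose` with a regex that also captures the enabled/disabled column.

---snip---
- name: List existing cells
  command: nova-manage cell_v2 list_cells --verbose
  register: existing_cells_list
  changed_when: false

- name: Extract cell details, including the extra enabled/disabled column
  set_fact:
    existing_cells: >-
      {{ existing_cells_list.stdout |
         regex_findall('\|\s+(\S*)\s+\|\s+([0-9a-f-]+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|') }}
---snap---
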
Scott Solkhon
09e02ef8f1 Support configuration of trusted CA certificate file
This commit adds the functionality for an operator to specify
their own trusted CA certificate file for interacting with the
Keystone API.

Implements: blueprint support-trusted-ca-certificate-file
Change-Id: I84f9897cc8e107658701fb309ec318c0f805883b
2019-08-16 12:47:42 +00:00
Zuul
daba362f43 Merge "Handle more return codes from nova-status upgrade check" 2019-08-05 08:42:10 +00:00
Radosław Piliszek
6a737b1968 Fix handling of docker restart policy
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.

This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.

I added some FIXMEs in the places relevant to the kolla-ansible docker
integration. They are not fixed here so as not to alter behavior.

[1] https://review.opendev.org/667363

Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-18 13:39:06 +00:00
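
Docker's valid restart policies are 'no', 'on-failure', 'always' and
'unless-stopped'; a hedged sketch of guarding against anything else
(the variable name is illustrative):

---snip---
- name: Ensure the configured restart policy is one Docker accepts
  assert:
    that:
      - restart_policy in ['no', 'on-failure', 'always', 'unless-stopped']
    fail_msg: "'{{ restart_policy }}' is not a valid Docker restart policy"
---snap---
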
Mark Goddard
d5e5e885d1 During deploy, always sync DB
A common class of problems goes like this:

* kolla-ansible deploy
* Hit a problem, often in ansible/roles/*/tasks/bootstrap.yml
* Re-run kolla-ansible deploy
* Service fails to start

This happens because the DB is created during the first run, but for some
reason we fail before performing the DB sync. This means that on the second run
we don't include ansible/roles/*/tasks/bootstrap_service.yml because the DB
already exists, and therefore still don't perform the DB sync. However this
time, the command may complete without apparent error.

We should be less conservative about when we perform the DB sync, and do it
whenever it might be necessary. There is an argument for not doing the sync
during a 'reconfigure' command, although we will not change that here.

This change always performs the DB sync during the 'deploy' and
'reconfigure' commands.

Change-Id: I82d30f3fcf325a3fdff3c59f19a1f88055b566cc
Closes-Bug: #1823766
Closes-Bug: #1797814
2019-07-12 08:56:54 +00:00
Mark Goddard
5be093ac5a Fix nova deploy with Ansible<2.8
Due to a bug in ansible, kolla-ansible deploy currently fails in nova
with the following error when used with ansible earlier than 2.8:

TASK [nova : Waiting for nova-compute services to register themselves]
*********
task path:
/home/zuul/src/opendev.org/openstack/kolla-ansible/ansible/roles/nova/tasks/discover_computes.yml:30
fatal: [primary]: FAILED! => {
    "failed": true,
    "msg": "The field 'vars' has an invalid value, which
        includes an undefined variable. The error was:
        'nova_compute_services' is undefined\n\nThe error
        appears to have been in
        '/home/zuul/src/opendev.org/openstack/kolla-ansible/ansible/roles/nova/tasks/discover_computes.yml':
        line 30, column 3, but may\nbe elsewhere in the file
        depending on the exact syntax problem.\n\nThe
        offending line appears to be:\n\n\n- name: Waiting
        for nova-compute services to register themselves\n ^
            here\n"
}

Example:
http://logs.openstack.org/00/669700/1/check/kolla-ansible-centos-source/81b65b9/primary/logs/ansible/deploy

This was caused by
https://review.opendev.org/#/q/I2915e2610e5c0b8d67412e7ec77f7575b8fe9921,
which hits upon an ansible bug described here:
https://github.com/markgoddard/ansible-experiments/tree/master/05-referencing-registered-var-do-until.

We can work around this by not using an intermediary variable.

Change-Id: I58f8fd0a6e82cb614e02fef6e5b271af1d1ce9af
Closes-Bug: #1835817
2019-07-08 19:58:51 +00:00
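
A simplified illustration of the bug and the workaround, not the exact
task from the role. With Ansible older than 2.8, a do-until loop cannot
reference its own registered result through an intermediary variable
defined in 'vars', so the fix is to use the registered variable
directly:

---snip---
# Fails on Ansible < 2.8: 'result' is only reachable via the 'vars' entry
- name: Wait for at least one nova-compute service to register
  command: openstack compute service list --service nova-compute -f json
  register: result
  vars:
    nova_compute_services: "{{ result.stdout | from_json }}"
  until: nova_compute_services | length > 0
  retries: 20
  delay: 10
  changed_when: false

# Workaround: drop the intermediary variable and reference 'result' directly
- name: Wait for at least one nova-compute service to register
  command: openstack compute service list --service nova-compute -f json
  register: result
  until: result.stdout | from_json | length > 0
  retries: 20
  delay: 10
  changed_when: false
---snap---
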
Mariusz
c68ed4dd51 Handle more return codes from nova-status upgrade check
In a single controller scenario, the "Upgrade status check result" task
does nothing because the previous task can only succeed when
`nova-status upgrade check` returns code 0. This change allows that
command to fail, so that the return code stored in
`nova_upgrade_check_stdout` can then be analysed.

This change also allows for warnings (rc 1) to pass.

Closes-Bug: #1834647

Change-Id: I6f5e37832f43f23604920b9d890cc505ca924ff9
2019-07-08 14:13:27 +01:00
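
A hedged sketch of the pattern: let the check command report its status
without aborting the play, then treat rc 0 and 1 (warnings) as
acceptable and anything else as a failure:

---snip---
- name: Run nova-status upgrade check
  command: nova-status upgrade check
  register: nova_upgrade_check_stdout
  failed_when: false
  changed_when: false

- name: Fail on upgrade check errors, allowing warnings (rc 1) to pass
  fail:
    msg: "nova-status upgrade check failed: {{ nova_upgrade_check_stdout.stdout }}"
  when: nova_upgrade_check_stdout.rc not in [0, 1]
---snap---
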
Zuul
8daad1abcf Merge "Wait for all compute services before cell discovery" 2019-07-05 10:31:29 +00:00
Mark Goddard
c38dd76711 Wait for all compute services before cell discovery
There is a race condition during nova deploy since we wait for at least
one compute service to register itself before performing cells v2 host
discovery.  It's quite possible that other compute nodes will not yet
have registered and will therefore not be discovered. This leaves them
not mapped into a cell, and results in the following error if the
scheduler picks one when booting an instance:

Host 'xyz' is not mapped to any cell

The problem has been exacerbated by merging a fix [1][2] for a nova race
condition, which disabled the dynamic periodic discovery mechanism in
the nova scheduler.

This change fixes the issue by waiting for all expected compute services
to register themselves before performing host discovery. This includes
both virtualised compute services and bare metal compute services.

[1] https://bugs.launchpad.net/kolla-ansible/+bug/1832987
[2] https://review.opendev.org/665554

Change-Id: I2915e2610e5c0b8d67412e7ec77f7575b8fe9921
Closes-Bug: #1835002
2019-07-04 13:03:12 +00:00
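
A hedged sketch of waiting for every expected compute service rather
than just one; the group name and CLI invocation are illustrative:

---snip---
- name: Wait for all nova-compute services to register themselves
  command: openstack compute service list --service nova-compute -f json
  register: nova_compute_services
  changed_when: false
  until: (nova_compute_services.stdout | from_json | length) >= (groups['compute'] | length)
  retries: 20
  delay: 10
---snap---
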
Mark Goddard
de00bf491d Simplify handler conditionals
Currently, we have a lot of logic for checking if a handler should run,
depending on whether config files have changed and whether the
container configuration has changed. As rm_work pointed out during
the recent haproxy refactor, these conditionals are typically
unnecessary - we can rely on Ansible's handler notification system
to only trigger handlers when they need to run. This removes a lot
of error prone code.

This patch removes conditional handler logic for all services. It is
important to ensure that we no longer trigger handlers unnecessarily,
because without these checks in place any spurious notification will
restart the containers.

Implements: blueprint simplify-handlers

Change-Id: I4f1aa03e9a9faaf8aecd556dfeafdb834042e4cd
2019-06-27 15:57:19 +00:00
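
A minimal sketch of the pattern being relied on: notify the handler
from the configuration task and let Ansible's notification system
decide whether it runs, with no extra 'when' conditionals. Paths and
names are illustrative, and kolla-ansible restarts containers via its
own module rather than the docker CLI:

---snip---
- hosts: compute
  tasks:
    - name: Copying over nova.conf
      become: true
      template:
        src: nova.conf.j2
        dest: /etc/kolla/nova-compute/nova.conf
        mode: "0660"
      notify:
        - Restart nova-compute container

  handlers:
    - name: Restart nova-compute container
      become: true
      command: docker restart nova_compute
---snap---
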
Zuul
651b983bdb Merge "Restart all nova services after upgrade" 2019-06-27 13:39:12 +00:00
Mark Goddard
e6d2b92200 Restart all nova services after upgrade
During an upgrade, nova pins the version of RPC calls to the minimum
seen across all services. This ensures that old services do not receive
data they cannot handle. After the upgrade is complete, all nova
services are supposed to be reloaded via SIGHUP to cause them to re-check
the RPC versions of other services and use the latest version, which
should now be supported by all running services.

Due to a bug [1] in oslo.service, sending services SIGHUP is currently
broken. We replaced the HUP with a restart for the nova_compute
container for bug 1821362, but not other nova services. It seems we need
to restart all nova services to allow the RPC version pin to be removed.

Testing in a Queens to Rocky upgrade, we find the following in the logs:

Automatically selected compute RPC version 5.0 from minimum service
version 30

However, the service version in Rocky is 35.

There is a second issue in that it takes some time for the upgraded
services to update the nova services database table with their new
version. We need to wait until all nova-compute services have done this
before the restart is performed, otherwise the RPC version cap will
remain in place. There is currently no interface in nova available for
checking these versions [2], so as a workaround we use a configurable
delay with a default duration of 30 seconds. Testing showed it takes
about 10 seconds for the version to be updated, so this gives us some
headroom.

This change restarts all nova services after an upgrade, after a 30
second delay.

[1] https://bugs.launchpad.net/oslo.service/+bug/1715374
[2] https://bugs.launchpad.net/nova/+bug/1833542

Change-Id: Ia6fc9011ee6f5461f40a1307b72709d769814a79
Closes-Bug: #1833069
Related-Bug: #1833542
2019-06-27 09:36:20 +00:00
Radosław Piliszek
cc058f4586 Make nova external ceph key extraction tasks non-changing
They are used only to obtain keys for the next task.

Change-Id: I2fac22af4710b70e4df8e3a272bcfb6cc8b8532e
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-06-26 14:21:20 +02:00
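
The non-changing behaviour presumably amounts to marking the
key-extraction commands with changed_when: false; an illustrative
sketch:

---snip---
- name: Pull the cephx key for the nova client
  become: true
  command: docker exec ceph_mon ceph auth get-key client.nova
  register: nova_cephx_raw_key
  changed_when: false
---snap---
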
Zuul
03976399f0 Merge "Avoid parallel discover_hosts (nova-related race condition)" 2019-06-24 13:08:23 +00:00
Zuul
6cae4dedfe Merge "Remove nova-consoleauth" 2019-06-17 16:28:45 +00:00
Radosław Piliszek
ce680bcfe2 Avoid parallel discover_hosts (nova-related race condition)
In rare cases, both kolla-ansible and nova-scheduler try to do
the mapping at the same time and one of them fails.
Since kolla-ansible runs host discovery on each deployment,
there is no need to change the default of no periodic host discovery.

I added some notes for the future. They are not critical.
I made the decision explicit in the comments.
I changed the task name to satisfy recommendations.
I removed the variable because it is not used (to avoid future doubts).

Closes-Bug: #1832987

Change-Id: I3128472f028a2dbd7ace02abc179a9629ad74ceb
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-06-16 20:42:39 +02:00
Jeffrey Zhang
4e032923c0 Remove nova-consoleauth
The nova-consoleauth service was deprecated during the Rocky release [1]
and has not been needed since then unless you're using cells v1. As Kolla
has never supported cells v1, which is finally being removed during
Train [2], we can get ahead of the curve and stop deploying
nova-consoleauth immediately.

[1] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html
[2] https://blueprints.launchpad.net/nova/+spec/remove-cells-v1/

Change-Id: I099080979f5497537e390f531005a517ab12aa7a
2019-06-16 16:39:07 +08:00
Zuul
ee895413eb Merge "Stop duplicating Nova cells" 2019-06-13 18:56:00 +00:00
Zuul
888e50f01b Merge "Use become for all docker tasks" 2019-06-07 10:47:23 +00:00
Zuul
0a1ad98105 Merge "Support multi-region discovery of Nova cells" 2019-06-07 09:08:04 +00:00
Zuul
01f0f2387d Merge "Hide logs when looping over passwords" 2019-06-07 08:53:40 +00:00
Mark Goddard
b123bf6621 Use become for all docker tasks
Many tasks that use Docker already have 'become' specified, but
not all. This change ensures that all tasks using the following
modules have 'become':

* kolla_docker
* kolla_ceph_keyring
* kolla_toolbox
* kolla_container_facts

It also adds become for 'command' tasks that use docker CLI.

Change-Id: I4a5ebcedaccb9261dbc958ec67e8077d7980e496
2019-06-06 19:04:58 +01:00
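
For 'command' tasks that call the docker CLI, the change amounts to
adding 'become: true'; a small illustrative example:

---snip---
- name: Inspect the nova_compute container
  become: true
  command: docker inspect nova_compute
  register: result
  changed_when: false
  failed_when: false
---snap---
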
Pierre Riteau
19b8dbe460 Stop duplicating Nova cells
Check if a base Nova cell already exists before calling `nova-manage
cell_v2 create_cell`, which would otherwise create a duplicate cell when
the transport URL or database connection change.

If a base cell already exists but the connection values have changed, we
now call `nova-manage cell_v2 update_cell` instead. This is only
possible if a duplicate cell has not yet been created. If one already
exists, we print a warning inviting the operator to perform a manual
cleanup. We don't use a hard fail to avoid an abrupt change of behavior
if this is backported to stable branches.

Change-Id: I7841ce0cff08e315fd7761d84e1e681b1a00d43e
Closes-Bug: #1734872
2019-06-06 18:10:06 +01:00
Jason
30c619d1bc Hide logs when looping over passwords
When Ansible loops over items, by default it prints all the keys of
the item it is looping over. Some roles, when setting up the databases,
iterate over an object that includes the database password.

Override the loop label to hide everything but the database name.

Change-Id: I336a81a5ecd824ace7d40e9a35942a1c853554cd
2019-06-05 08:09:51 -05:00
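
A hedged sketch using the stock mysql_user module and an assumed
nova_databases list; loop_control's label limits what Ansible prints
per item to the database name:

---snip---
- name: Creating Nova databases user and setting permissions
  mysql_user:
    name: "{{ item.database_username }}"
    password: "{{ item.database_password }}"
    priv: "{{ item.database_name }}.*:ALL"
    host: "%"
  loop: "{{ nova_databases }}"
  loop_control:
    label: "{{ item.database_name }}"
---snap---
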
Jason
328e14253d Support multi-region discovery of Nova cells
In a multi-region environment, each region is deployed separately.
Cell discovery, however, would sometimes fail because it picked a region
other than the one being deployed. Most likely, an internal endpoint
for region A will not be visible from region B. Furthermore, it is not
very useful to discover hosts in a region you're not modifying.

This changes the check to only run against nova compute services located
in the region being deployed.

Change-Id: I21eb1164c2f67098b81edbd5cc106472663b92cb
2019-06-05 08:07:13 -05:00
ZhongShengping
41f3a817ac Move to opendev
1. Use opendev.org instead of git.openstack.org.
2. Use review.opendev.org instead of review.openstack.org.

You can see the discussion below:
http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003603.html

Change-Id: Ice4509204df788a1a44a06fb89fb44cfe6b54b94
2019-04-23 13:28:39 +08:00
Mark Goddard
a4bb8567da Fix up config file permissions on the host
Several config file permissions are incorrect on the host. In general,
files should be 0660, and directories and executables 0770.

Change-Id: Id276ac1864f280554e98b937f2845bb424d521de
Closes-Bug: #1821579
2019-04-02 17:23:31 +01:00
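
An illustrative sketch of the convention described above; paths are
examples:

---snip---
- name: Ensure config directories are not world accessible
  become: true
  file:
    path: /etc/kolla/nova-compute
    state: directory
    mode: "0770"

- name: Ensure config files are not world readable
  become: true
  file:
    path: /etc/kolla/nova-compute/nova.conf
    state: file
    mode: "0660"
---snap---
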
Zuul
ed5588c934 Merge "Don't pull images during upgrade" 2019-03-28 12:41:22 +00:00
Zuul
4a5d8b0d05 Merge "Add missing handlers for external Ceph." 2019-03-26 06:17:09 +00:00
Mark Goddard
192dcd1e1b Fix booting instances after nova-compute upgrade
After upgrading from Rocky to Stein, nova-compute services fail to start
new instances with the following error message:

Failed to allocate the network(s), not rescheduling.

Looking in the nova-compute logs, we also see this:

Neutron Reported failure on event
network-vif-plugged-60c05a0d-8758-44c9-81e4-754551567be5 for instance
32c493c4-d88c-4f14-98db-c7af64bf3324: NovaException: In shutdown, no new
events can be scheduled

During the upgrade process, we send nova containers a SIGHUP to cause
them to reload their object version state. Speaking to the nova team in
IRC, there is a known issue with this, caused by oslo.service performing
a full shutdown in response to a SIGHUP, which breaks nova-compute.
There is a patch [1] in review to address this.

The workaround employed here is to restart the nova compute service.

[1] https://review.openstack.org/#/c/641907

Change-Id: Ia4fcc558a3f62ced2d629d7a22d0bc1eb6b879f1
Closes-Bug: #1821362
2019-03-22 16:26:36 +00:00
Scott Solkhon
c70d806666 Add missing handlers for external Ceph.
When Nova, Glance, or Cinder are deployed alongside an external Ceph
deployment, handlers fail to trigger when keyring files are updated, which
results in the containers not being restarted.

This change adds the missing 'when' conditions for nova-libvirt, nova-compute,
cinder-volume, cinder-backup, and glance-api containers.

Change-Id: I8e183aac9a72e7a7210f7edc7cdcbaedd4fbcaa9
2019-03-22 11:20:34 +00:00
Mark Goddard
58d6dc3bcf Don't pull images during upgrade
When adding the rolling upgrade support, some upgrade procedures were
modified to pull images explicitly. This is done inconsistently between
services, and is a change in behaviour from Rocky and earlier releases.

This change removes all image pulling from upgrade tasks.

Change-Id: Id0fed17714235e1daed60b83b1f30620f097eb97
2019-03-20 18:51:45 +00:00
Eduardo Gonzalez
2fc6d4cfc5 Split placement from nova
Depends-On: https://review.openstack.org/#/c/642958
Depends-On: https://review.openstack.org/642984
Change-Id: If795a9eb3ec92f75867ce3f755d6b832eba31af9
2019-03-15 15:19:54 +00:00
wu.chunyang
7d9cb44d1f Restart containers when ceph.conf changed
When ceph.conf changes, we need to restart some containers.

Change-Id: Iddeaf9dd4f288165fcef288e5384d79b61a0910b
Closes-Bug: #1810010
2019-03-02 16:22:24 +08:00
Jim Rollenhagen
51c9e1b633 Allow nova services to use independent hostnames
This allows nova service endpoints to use custom hostnames, and adds the
following variables:

* nova_internal_fqdn
* nova_external_fqdn
* placement_internal_fqdn
* placement_external_fqdn
* nova_novncproxy_fqdn
* nova_spicehtml5proxy_fqdn
* nova_serialproxy_fqdn

These default to the old values of kolla_internal_fqdn or
kolla_external_fqdn.

This also adds the following variables:

* nova_api_listen_port
* nova_metadata_listen_port
* nova_novncproxy_listen_port
* nova_spicehtml5proxy_listen_port
* nova_serialproxy_listen_port
* placement_api_listen_port

These default to <service>_port, e.g. nova_api_port, for backward
compatibility.

These options allow the user to differentiate between the port the
service listens on, and the port the service is reachable on. This is
useful for external load balancers which live on the same host as the
service itself.

Change-Id: I7bcce56a2138eeadcabac79dd07c8dba1c5af644
Implements: blueprint service-hostnames
2019-02-08 10:25:02 -05:00
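
A hedged sketch of the backward-compatible defaults this implies (role
defaults; variable names as listed above, file location assumed):

---snip---
# ansible/roles/nova/defaults/main.yml (sketch)
nova_internal_fqdn: "{{ kolla_internal_fqdn }}"
nova_external_fqdn: "{{ kolla_external_fqdn }}"
nova_api_listen_port: "{{ nova_api_port }}"
nova_metadata_listen_port: "{{ nova_metadata_port }}"
nova_novncproxy_listen_port: "{{ nova_novncproxy_port }}"
---snap---
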
Mark Goddard
365bb5177d Create cells before starting nova services
Nova services may reasonably expect cell databases to exist when they
start. The current cell setup tasks in kolla run after the nova
containers have started, meaning that cells may or may not exist in the
database when they start, depending on timing. In particular, we are
seeing issues in kolla CI currently with jobs timing out waiting for
nova compute services to start. The following error is seen in the nova
logs of these jobs, which may or may not be relevant:

No cells are configured, unable to continue

This change creates the cell0 and cell1 databases prior to starting nova
services.

In order to do this, we must create new containers in which to run the
nova-manage commands, because the nova-api container may not yet exist.
This required adding support to the kolla_docker module for specifying a
command for the container to run that overrides the image's command.

We also add the standard output and error to the module's result when a
non-detached container is run. A secondary benefit of this is that the
output of bootstrap containers is now displayed in the Ansible output if
the bootstrapping command fails, which will help with debugging.

Change-Id: I2c1e991064f9f588f398ccbabda94f69dc285e61
Closes-Bug: #1808575
2018-12-14 19:26:42 +00:00
Paul Bourke
a16d78711f Allow operators to customise Nova vendor info
Nova allows customisation of various metadata passed through to VMs via
a 'release' file[0]. Allow operators to make use of this.

[0] https://github.com/openstack/nova/blob/master/etc/nova/release.sample

Change-Id: I71569314c8e64320f8ffad79b9273f4d6d903bb6
2018-11-30 09:48:28 +00:00
Eduardo Gonzalez
1a682fab28 Support stop specific containers
With this change, an operator is able to stop a service's
containers without stopping all services on a host.
This change is the starting point for
fast-forward upgrade support.
In subsequent changes, new flags will be introduced to disable
stopping data plane services during upgrades.

Change-Id: Ifde7a39d7d8596ef0d7405ecf1ac1d49a459d9ef
Implements: blueprint support-stop-containers
2018-11-26 08:07:01 +00:00
Christian Berendt
03788e17d4 Set "no_log" for "databases user and setting permissions" tasks
At the moment the "databases user and setting permissions" task for
designate and nova leaks the database_password because of the use
of with_items:

---snip---
TASK [nova : Creating Nova databases user and setting permissions] *********************************************************
ok: [x -> y] => (item={u'database_password': u'password', u'database_name': u'nova', u'database_username': u'nova'})
ok: [x -> y] => (item={u'database_password': u'password', u'database_name': u'nova_cell0', u'database_username': u'nova'})
ok: [x -> y] => (item={u'database_password': u'password', u'database_name': u'nova_api', u'database_username': u'nova_api'})
---snap---

Change-Id: I141e4153223c8772c82a31d81e58057ce266c0b9
Co-authored-by: Bernd Müller <mueller@b1-systems.de>
2018-11-19 11:10:41 +00:00
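
The fix itself is a one-line addition: mark the looping task with
no_log so nothing about the items is printed (module and list name are
illustrative, as in the loop-label sketch earlier in this log):

---snip---
- name: Creating Nova databases user and setting permissions
  mysql_user:
    name: "{{ item.database_username }}"
    password: "{{ item.database_password }}"
    priv: "{{ item.database_name }}.*:ALL"
  with_items: "{{ nova_databases }}"
  no_log: true
---snap---
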
Zuul
c0435b833a Merge "Generate Ceph configuration during upgrade" 2018-10-26 06:33:36 +00:00
Christian Berendt
864e589803 nova: add support for a dedicated migration network
Add two new parameters (migration_interface, migration_interface_address)
to make the use of a dedicated migration network possible.

Change-Id: I723c9bea9cf1881e02ba39d5318c090960c22c47
2018-10-23 18:37:28 +02:00
Mark Goddard
242625dff4 Generate Ceph configuration during upgrade
When upgrading the nova, cinder or manila services via 'kolla-ansible
upgrade', the Ceph config files are not generated. Users will expect
these files to be generated, to pull in any changes from their own
configuration or the base kolla configuration.

This change moves Ceph tasks inside config.yml to ensure that they are
performed during deploy, reconfigure and upgrade. This has been done for
nova, cinder, gnocchi and manila - glance already does this.

Change-Id: Ic75692c2bcba9b81dee922ff6fbbccd160e7fa19
Closes-Bug: #1794275
2018-10-10 10:48:55 +01:00