When we were upgrading multiple nodes at the same time,
e.g. controllers, and a task on one of the nodes failed, the other
nodes would keep upgrading. This is undesirable and can be fixed by
adding any_errors_fatal to the Ansible plays.
Change-Id: Iad2b5e32e955da41af4d2b8dd8ad8aa1eb5dffa9
Closes-Bug: #1804468
To continue the work that was done in
I711dbb00a9c34dbd96ef179ef41bff281b0001d1, we also need to skip the common
deploy tasks if --skip-deploy-identifier is passed by the operator.
When using --skip-deploy-identifier, the UpdateIdentifier is set to
None.
Ansible doesn't treat None as "", so we really need to test whether the
variable is defined. This patch changes the logic to test that.
We also support the case where the variable is set to "", and consider
it empty, which means we want to skip the deploy/updates.
The same is done for the update playbooks, which include tasks from
common deploy.
It does not replicate the exact condition used in deploy_steps_playbook.
There is no need to also check whether the
/var/lib/docker-container-startup-configs.json file exists, because it
was created during the initial deployment.
This fixes the bug where --skip-deploy-identifier wasn't honored during
stack updates.
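As a rough illustration of the check described above (not the actual
Ansible condition), the three "empty" cases can be modelled in Python.
The variable name deploy_identifier is an assumption made for this
sketch only:

```python
def should_skip_deploy(play_vars):
    """Return True when the deploy identifier counts as empty."""
    # "deploy_identifier" is a hypothetical variable name for this sketch.
    # dict.get() returns None for a missing key, so an undefined variable
    # and an explicit None are handled by the same branch.
    value = play_vars.get("deploy_identifier")
    return value is None or value == ""
```

An undefined variable, None and "" all lead to skipping; any other
value lets the tasks run.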
Co-Authored-By: Thomas Hervé <therve@redhat.com>
Co-Authored-By: Sam Doran <sdoran@redhat.com>
Change-Id: Ibab17dcaeebea65135fca4f40562109c90f36c27
Related-Bug: #1796924
container_cli will be used later by update, upgrade and post upgrade tasks.
This patch is separated from actual tasks, so we can quickly iterate in
multiple patches.
Change-Id: I1ed7dec0019113f1259bce986f354723237f6a25
We should pass in the common vars to all the common plays in
deploy-steps.j2 so that tasks will have them available. Some of these
parameter driven variables were never actually wired up, so they didn't
work to begin with (such as enable_puppet/enable_debug).
Change-Id: I830e1ae21fe3e278a5f7591065d066c0a6883a9a
Closes-Bug: #1785635
To match the previous functionality when not using config-download, the
common deploy step tasks should be skipped for already deployed nodes
when using --skip-deploy-identifier.
This patch adds a task to check if one of the json configuration files
created by the common tasks already exists. If it does, and
--skip-deploy-identifier has caused an empty DeployIdentifier parameter
value, the tasks will be skipped for that node.
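The per-node decision described above could be sketched as follows.
This is a simplified Python model, not the task that the patch adds;
the function name and both parameters are hypothetical:

```python
import os


def skip_common_tasks(config_path, deploy_identifier):
    """Skip only if the node was deployed before AND the identifier is empty."""
    # config_path stands for one of the json configuration files that the
    # common tasks create; its presence marks an already deployed node.
    already_deployed = os.path.exists(config_path)
    return already_deployed and deploy_identifier == ""
```

A new node (no file yet) still runs the common tasks even when the
identifier is empty.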
Change-Id: I711dbb00a9c34dbd96ef179ef41bff281b0001d1
Closes-Bug: #1796924
So far the tasks for external update/upgrade were not using the step
mechanism the way other tasks do; we had a single step. As external
deploy/update/upgrade tasks are being used for more things nowadays,
it's likely that we'll need to move towards a model similar to the one
we have for deploy/update/upgrade tasks -- proper usage of steps.
For now we have just 2:
* Step 0 for setting global facts, and performing validations.
* Step 1 for actual update/upgrade tasks. (There's an upcoming change
to run online data migrations in step 1).
Change-Id: I1933bd0eedab71caab56c0e5d93ba7927fb7c20f
Partial-Bug: #1793332
This adds a tag step[1-5] to each of the plays within the jinja2 loop to
create our 5 deployment steps. Using these tags, it's possible to run
these plays individually if desired.
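To illustrate the idea, here is a small Python model of a loop that
tags one play per step, plus tag-based selection. The play names and
both helper functions are made up for this sketch; the real plays are
generated by the jinja2 template:

```python
def build_step_plays(num_steps=5):
    """Build one play per deployment step, tagged step1..stepN."""
    # The play names are placeholders; only the tag scheme comes from
    # the commit message.
    return [
        {"name": f"Deploy step {step}", "tags": [f"step{step}"]}
        for step in range(1, num_steps + 1)
    ]


def select_plays(plays, wanted_tags):
    """Keep the plays that carry at least one of the requested tags."""
    wanted = set(wanted_tags)
    return [play for play in plays if wanted & set(play["tags"])]
```

Selecting the tag step3 from the five generated plays would return
only the third play.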
Change-Id: Ic705afbf174b4597d98c2b83041ff88dd8d6664c
Create a new parameter in TripleO: ContainerCli.
The default is set to 'docker' for backward compatibility, but it can
also be set to 'podman'.
When podman is selected, the right commands will be run so that
docker-puppet can configure the containers with Podman as the
container library backend.
It removes the tripleo_logs:/var/log/tripleo/ mount that was used
by tripleo-ui, since we shouldn't do that here. We'll create a bind mount
in the tripleo-ui container later.
It runs puppet with FACTER_hostname only if NET_HOST is disabled.
Change-Id: I240b15663b720d6bd994d5114d43d51fa26d76cc
Co-Authored-by: Martin André <m.andre@redhat.com>
When blacklisting all servers from the primary role, the yaql expression
to get the bootstrap_server_id value fails as it tries to index the
empty list at element 0. In this case, default the bootstrap_server_id
value to a constant string which won't match any actual server IDs.
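The fallback can be sketched in Python as below. This is not the yaql
expression from the template; the function name and the placeholder
string are assumptions, since the message only says "a constant
string":

```python
# Placeholder value chosen for this sketch; the real constant may differ.
NO_BOOTSTRAP_SERVER = "bootstrap-server-id-unset"


def bootstrap_server_id(server_ids, blacklist):
    """Return the first non-blacklisted server id, or the placeholder."""
    remaining = [sid for sid in server_ids if sid not in blacklist]
    # Indexing remaining[0] directly would raise IndexError when every
    # server is blacklisted, which mirrors the failure being fixed.
    return remaining[0] if remaining else NO_BOOTSTRAP_SERVER
```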
Change-Id: Ibb26245156675f64709bab075875ce4b498b4db6
Closes-Bug: #1785665
Not all vars were getting passed to deploy-steps-tasks.yaml when using
config-download. This didn't cause any issue because all the vars have
a default value, but the user-specified value should be honored as well.
Change-Id: I5972e1c674cf9008366c2bb10b54eb975ab8cb93
Closes-Bug: #1785635
Update the play for the server pre and post steps so that the tasks run
in parallel across all roles, instead of doing one role at a time. By
not using the "when" attribute, and relying on the tripleo_role_name var
for the list of deployments, we can force these tasks to run in parallel
across all roles.
Change-Id: I83a4deaa68d5788edb5ab13652bb30c762f337d8
Running `openstack overcloud external-update run` will update all
external services. This commit adds the possibility of running
`openstack overcloud external-update run --tags ceph` to update just
Ceph. It works analogously for upgrades.
Change-Id: Ic1786b6dbfa54516bfb836b450fc35452dca8cb5
Partial-Bug: #1783949
Composable service templates can now define external_update_tasks and
external_upgrade_tasks. They are meant for update/upgrade logic of
services deployed via external_deploy_tasks. The external update
playbook first executes external_update_tasks and then
external_deploy_tasks, the procedure for upgrades works
analogously. All happens within a single playbook, so variables or
fact overrides exported from the update/upgrade tasks will be
available to the deploy tasks during the update/upgrade procedure.
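A minimal Python model of that ordering, assuming tasks are plain
callables that read and write one shared dict of facts (the function
name and the task shape are inventions of this sketch, not how Ansible
represents tasks):

```python
def run_external_playbook(update_tasks, deploy_tasks, facts=None):
    """Run update tasks first, then deploy tasks, sharing one facts dict."""
    shared = dict(facts or {})
    # Both task lists run in the same "playbook", so anything an update
    # task stores in `shared` is visible to the deploy tasks after it.
    for task in list(update_tasks) + list(deploy_tasks):
        task(shared)
    return shared
```

For example, an update task that sets a fact override will have that
value picked up by a deploy task that reads the same key.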
Partial-Bug: #1783949
Change-Id: Ib2474e8f69711cd6610a78884d5032ffd19ad249
"undercloud" host is too opinionated and hostnames can change. We should
rather apply the tasks to the Undercloud HostGroup, which contains one
host for now: the actual undercloud hostname.
So this patch switches "undercloud" to "Undercloud" so when the hostname
isn't "undercloud", the external tasks will run correctly on this host.
Change-Id: I7200f930387406e6cc8e6fee6d5278768074c892
Closes-Bug: #1784910
host_prep_tasks are run from deploy_steps_playbook.yaml, so there's no
need to also run them as part of the {{role}}HostPrepDeployment
resources.
Change-Id: If1bf6dda19e6e0b875463c421f9504efab85251b
Problem: RHEL and CentOS 8 will deprecate the usage of Yum.
From the DNF release notes:
DNF is the next upcoming major version of yum, a package
manager for RPM-based Linux distributions.
It roughly maintains CLI compatibility with YUM and defines a strict API for
extensions.
Solution: Use the "package" Ansible module instead of "yum".
The "package" module is smarter when it comes to detecting which package
manager runs on the system. The goal of this patch is to support both
yum and dnf (dnf will be the default in RHEL/CentOS 8) from a single
Ansible module.
Change-Id: I8e67d6f053e8790fdd0eb52a42035dca3051999e
Deploy steps run the docker puppet steps with a maximum of
3 processes. This takes about 30 minutes to finish the
container configuration for a typical overcloud (in CI).
Double the number to allow more puppet runs to finish their
tasks sooner.
Change-Id: Id0b0371e7f21f56528027921732ade786525d659
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
The include module is deprecated. The alternative is to use import_tasks
for static task file inclusion and include_tasks for dynamic task
file inclusion (such as when using with_items).
Change-Id: I8b3bf3ba3d7c2cfbe1187218c51f619e65efe0e5
We drop the post_update_steps_playbook, and execute post_update_tasks
as part of the update_steps_playbook. This will ensure that
post_update_tasks are executed, and they're executed in accordance
with the `serial: 1` ordering that update_steps_playbook is using. (We
want to avoid an issue similar to what we've had in bug #1776206).
Change-Id: I15a984172cd5532bc966269d8c68f27b5703733e
Closes-Bug: #1778471
The ansible command generated in ansible-playbook-command.sh has
"--become" in it by default.
This commit removes "become: true" where it is used, to avoid confusion
in deploy steps. Today we explicitly set "become: false" in
deploy-steps.j2 for certain actions, so there's no point in also having
"become: true" for the other ones.
We have a release note [1] that explains why the "become" was
introduced, but maybe we can revisit it.
[1] releasenotes/notes/use-become-true-in-deploy-steps-playbook-01decb18d895879f.yaml
Change-Id: Ic666b4ecaecf0591dd8bb0406f239649b20b9623
We should re-run host_prep_tasks as part of the minor update, to make
sure the host is ready for starting the updated containers. The right
place for them is between update tasks and deployment tasks.
This is important in case we deliver changes to host_prep_tasks during
minor update, or if update_tasks do something that would partially
undo the host preparation, e.g. clear/delete some directories on the
host to get rid of previous state.
Change-Id: Ic0a905a8c4691cbe75113131bd84e8a39dea046d
Related-Bug: #1776206
In order to make the deployment more flexible, we should allow the
ansible hosts to be configurable rather than tied to the old
undercloud/overcloud concepts. Rather than assume
'undercloud'/'overcloud', we should allow these to include the same
set of hosts. This change introduces 'deployment_source_hosts' and
'deployment_target_hosts' variables that can be used to control where
the tasks run.
Change-Id: I249cc7e179bc1423788aab967c4b2e3f9ffc81d4
Related-Blueprint: all-in-one
In I4b576a6e7fbfb18fa13221e2d080bf7876a8303e state information
will be persisted in Swift and the name of the Swift container
should be a function of the Heat stack in case multiple stacks
are deployed. This patch passes the name of the Heat stack to
the Mistral environment so that the workflow may access the
Heat stack name and name the Swift container accordingly.
Change-Id: I995ad32345a39238ffb9cbcf9966dedc60c75ff8
Related-Bug: #1769769
"role_name" is internal to Ansible, we should not use it.
This patch uses the new variable set in the inventory, a specific
TripleO var: tripleo_role_name. It holds the TripleO role name, not
the Ansible role name; the two are very different.
Depends-On: I57c4eac87e2f96dfe5490b111cd2508505715d56
Change-Id: Iecaf6f1b830e65be2f9e2e44431054fe46f9f565
Related-Bug: #1771171
For NFV deployments, specific kernel args should be applied and
the nodes should be restarted before running the NetworkDeployment.
It is supported in the heat deployment via PreNetworkConfig. In the
config-download mechanism, ansible steps need to be improved
to handle the reboot and wait for the node.
Change-Id: I43b383ad0e04b8be6c321f8c5b05e628b2520141
The new master branch should now point to rocky.
So, HOT templates should specify that they might contain features
for the rocky release [1].
Also, this submission updates the yaml validation to accept only the
latest heat_version alias. There are cases in which we will need to set
the version for specific templates, i.e. mixed versions, so a variable
is added to assign specific templates to specific heat_version
aliases, avoiding the introduction of errors by bulk-replacing
the old version in new releases.
[1]: https://docs.openstack.org/heat/latest/template_guide/hot_spec.html#rocky
Change-Id: Ib17526d9cc453516d99d4659ee5fa51a5aa7fb4b
This patch adds the primary role name as the first host pattern in the
individual plays in deploy-steps.j2. This will ensure that the primary
role will execute tasks first, which is needed so that all Pacemaker
nodes run the same step at the same time.
Change-Id: I9c499be87ce51ae28914b013b4b91446a3a68015
Closes-Bug: #1768238
So far we haven't been disabling workflows for update/upgrade. We
should disable them by default as they could have the potential to
disrupt the update/upgrade/ffwd procedure.
The main example of a thing we deploy via the workflow resources is
Ceph. We decided that no-opping ceph-ansible for the main
update/upgrade/ffwd upgrade steps is the safest path forward, and we'll
update/upgrade Ceph after the main procedure is finished.
Change-Id: I34c7213ab7b70963ad2e50f7633b665fad70bde5
These tasks would run before any individual server deployments. A
specific use case is for rebooting dpdk/nfv nodes before applying
NetworkDeployment, etc.
Change-Id: I9e410def25184e635568db67149264ac89c999ed
Add blank lines between the Ansible tasks and plays in the stack
outputs. This is an improvement in readability for the user.
Change-Id: I52ebd9081cacf213ac29f1d24e73db6ea5cfe33f
In I75f087dc456c50327c3b4ad98a1f89a7e012dc68 we removed much of
the legacy upgrade workflow. This now also removes the
disable_upgrade_deployment flag and the tripleo_upgrade_node.sh
script, both of which are no longer used and have no effect on
the upgrade.
Related reviews
I7b19c5299d6d60a96a73cafaf0d7103c3bd7939d tripleo-common
I4227f82168271089ae32cbb1f318d4a84e278cc7 python-tripleoclient
Change-Id: Ib340376ee80ea42a732a51d0c195b048ca0440ac
Add support for the SshKnownHostsDeployment resources to
config-download. Since the deployment resources relied on Heat outputs,
they were not supported with the default handling from tripleo-common
that relies on the group_vars mechanism.
Instead, this patch refactors the templates to add the known hosts
entries as global_vars to deploy_steps_playbook.yaml, and then includes
the new tripleo-ssh-known-hosts role from tripleo-common to apply the
same configuration that the Heat deployment did.
Since these deployments no longer need to be triggered when including
config-download-environment.yaml, a mapping is added that can be
overridden to OS::Heat::None to disable the deployment resources when
using config-download.
The default behavior when not using config-download remains unchanged.
Closes-Bug: #1746336
Change-Id: Ia334fe6adc9a8ab228f75cb1d0c441c1344e2bd9
The resulting pre_upgrade_rolling_steps_playbook will be executed in a
node-by-node rolling fashion at the beginning of the major upgrade
workflow (before upgrade_steps_playbook).
The current intended use case is special handling of L3 agent upgrade
when moving Neutron services into containers. Special care needs to be
taken in this case to preserve L3 connectivity of instances (with
regard to dnsmasq and keepalived sub-processes of L3 agent).
The playbook can be run before the main upgrade like this:
openstack overcloud upgrade run --roles overcloud --playbook pre_upgrade_rolling_steps_playbook.yaml
Partial-Bug: #1738768
Change-Id: Icb830f8500bb80fd15036e88fcd314bf2c54445d
Implements: blueprint major-upgrade-workflow
In the last step of FFU we need to switch repos before running the upgrade.
We do so by introducing post FFU steps and running the switch in
them. We also update heat agents and os-collect-config on nodes.
Change-Id: I649afc6fa384ae21edc5bc917f8bb586350e5d47