125 Commits

Author SHA1 Message Date
Jiri Stransky
021d1b1efb Stop upgrade if a task on one node fails
When we were upgrading multiple nodes at the same time,
e.g. controllers, and a task on one of the nodes failed, the other
nodes would keep upgrading. This is undesirable and can be fixed by
adding any_errors_fatal to the Ansible plays.
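
As a minimal sketch of the setting (the host group and the task are
placeholders, not the actual plays touched by this change), a play using
any_errors_fatal might look like:

```yaml
# Hypothetical play: "Controller" and the task below are illustrative only.
- hosts: Controller
  any_errors_fatal: true   # a failure on any host aborts the play for all hosts
  tasks:
    - name: Example upgrade task
      command: /bin/true
```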

Change-Id: Iad2b5e32e955da41af4d2b8dd8ad8aa1eb5dffa9
Closes-Bug: #1804468
2018-11-21 15:48:53 +01:00
Emilien Macchi
eeb07fcb4a Honor --skip-deploy-identifier in common deploy tasks for updates
To continue the work that was done in
I711dbb00a9c34dbd96ef179ef41bff281b0001d1, we also need to skip the common
deploy tasks if --skip-deploy-identifier is passed by the operator.

When using --skip-deploy-identifier, the UpdateIdentifier is set to
None.
Ansible doesn't see None as "", so we really need to test if the
variable is defined or not. This patch changes the logic to test that.
We also support the case where the variable is set to "", and consider
it as empty, which means we want to skip the deploy/updates.
It also does this for the update playbooks, which include tasks from
common deploy.

It does not replicate the exact condition from deploy_steps_playbook.
There is no need to also check whether the /var/lib/docker-container-startup-configs.json
file is present, because it was created during the initial deployment.

This fixes the bug where --skip-deploy-identifier wasn't honored during
stack updates.
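
A rough sketch of such a condition, assuming a variable named
update_identifier and a placeholder task file (neither name is taken from
the patch itself):

```yaml
# Hypothetical task: "update_identifier" and the included file name are
# assumptions made for illustration.
- name: Run common deploy tasks unless the identifier is unset or empty
  include_tasks: example_common_tasks.yaml
  when:
    - update_identifier is defined       # variable was passed at all
    - update_identifier is not none      # None is not the same as ""
    - update_identifier != ""            # explicit empty string also skips
```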

Co-Authored-By: Thomas Hervé <therve@redhat.com>
Co-Authored-By: Sam Doran <sdoran@redhat.com>

Change-Id: Ibab17dcaeebea65135fca4f40562109c90f36c27
Related-Bug: #1796924
2018-11-19 13:53:13 +00:00
Emilien Macchi
de798c5947 Use container_cli for post_upgrade_tasks & external_upgrade_tasks
- Export container_cli for post_upgrade_tasks & external_deploy_tasks
  and external_upgrade_tasks
- Replace "docker exec" by {{ container_cli }} exec in these tasks
  (cinder, nova, mysql, ironic and TLS).
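
For illustration only, a task written this way could look like the
following; the container name and the command are placeholders, and
container_cli is set inline here just to keep the sketch self-contained:

```yaml
# Hypothetical task: the container name and command are placeholders.
- name: Run a command inside a running container
  command: "{{ container_cli }} exec example_container example_command"
  vars:
    container_cli: docker   # or "podman"
```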

Depends-On: Iff509f4dc09862a451ad5cf915aa7764a314c28c
Change-Id: I7b11f44c9255294863879aaff88d0dd1672bff6e
2018-11-05 12:00:46 -05:00
Emilien Macchi
da224f7a9c Export container_cli for update & upgrades & post upgrade tasks
container_cli will be used later by update, upgrade and post upgrade tasks.
This patch is separated from actual tasks, so we can quickly iterate in
multiple patches.

Change-Id: I1ed7dec0019113f1259bce986f354723237f6a25
2018-11-03 03:56:59 +00:00
James Slagle
5a5ad11d0b Add common vars to common plays
We should pass in the common vars to all the common plays in
deploy-steps.j2 so that tasks will have them available. Some of these
parameter driven variables were never actually wired up, so they didn't
work to begin with (such as enable_puppet/enable_debug).

Change-Id: I830e1ae21fe3e278a5f7591065d066c0a6883a9a
Closes-Bug: #1785635
2018-10-25 14:32:17 +02:00
Zuul
94943cfff9 Merge "Introduce proper steps to external update/upgrade tasks" 2018-10-17 15:03:47 +00:00
James Slagle
a7955832df Honor --skip-deploy-identifier in common deploy tasks
To match the previous functionality when not using config-download, the
common deploy step tasks should be skipped for already deployed nodes
when using --skip-deploy-identifier.

This patch adds a task to check if one of the json configuration files
created by the common tasks already exists. If it does, and
--skip-deploy-identifier has caused an empty DeployIdentifier parameter
value, the tasks will be skipped for that node.

Change-Id: I711dbb00a9c34dbd96ef179ef41bff281b0001d1
Closes-Bug: #1796924
2018-10-09 11:55:13 -04:00
Jiri Stransky
bcd6cde608 Introduce proper steps to external update/upgrade tasks
So far the tasks for external update/upgrade were not using the step
mechanism like other tasks; we had a single step. As external
deploy/update/upgrade tasks are being used for more things nowadays,
it's likely that we'll need to move towards a model similar to the one
we have for deploy/update/upgrade tasks: proper usage of steps.

For now we have just 2:

* Step 0 for setting global facts, and performing validations.

* Step 1 for actual update/upgrade tasks. (There's an upcoming change
  to run online data migrations in step 1).
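
A possible shape for tasks split across the two steps, assuming the
playbook passes in a "step" variable as it does for other step-based
tasks (the task names and example_required_var are hypothetical):

```yaml
# Hypothetical external update tasks; "step" is assumed to be supplied
# by the playbook that loops over the steps.
- name: Set global facts and run validations
  when: step|int == 0
  block:
    - name: Example validation
      assert:
        that: example_required_var is defined

- name: Run the actual update
  when: step|int == 1
  block:
    - name: Example update task
      debug:
        msg: "updating"
```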

Change-Id: I1933bd0eedab71caab56c0e5d93ba7927fb7c20f
Partial-Bug: #1793332
2018-10-04 12:08:21 +02:00
James Slagle
bf6efb06c7 Tag step plays
This adds a tag step[1-5] to each of the plays within the jinja2 loop to
create our 5 deployment steps. Using these tags, it's possible to run
these plays individually if desired.
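
A simplified sketch of what two of the rendered plays might look like
(the host pattern, play names and tasks are placeholders):

```yaml
# Hypothetical rendered plays; host pattern and tasks are placeholders.
- hosts: overcloud
  name: Deploy step 1
  tags: step1
  tasks:
    - debug:
        msg: "step 1 tasks"

- hosts: overcloud
  name: Deploy step 2
  tags: step2
  tasks:
    - debug:
        msg: "step 2 tasks"
```

With plays tagged like this, passing --tags step2 to ansible-playbook
would run only the second play.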

Change-Id: Ic705afbf174b4597d98c2b83041ff88dd8d6664c
2018-09-24 09:17:15 -04:00
Emilien Macchi
e175e5ab2f Initial support for Podman in docker-puppet
Create a new parameter in TripleO: ContainerCli.
The default is set to 'docker' for backward compatibility, but it can
also be set to 'podman'.
When podman is selected, the right commands will be run so docker-puppet
can configure the containers with Podman as the container
library backend.

It removes the tripleo_logs:/var/log/tripleo/ mount that was used
by tripleo-ui but we shouldn't do that here. We'll create a bind mount
in tripleo-ui container later.

It runs puppet with FACTER_hostname only if NET_HOST is disabled.

Change-Id: I240b15663b720d6bd994d5114d43d51fa26d76cc
Co-Authored-by: Martin André <m.andre@redhat.com>
2018-09-08 05:23:00 +00:00
Zuul
639a043f0d Merge "Allow performing Ceph update/upgrade separately" 2018-09-04 23:04:41 +00:00
Zuul
06c4507550 Merge "Parallelize server pre and post steps" 2018-08-21 19:03:14 +00:00
Zuul
46ef074336 Merge "Default bootstrap_server_id" 2018-08-19 02:50:12 +00:00
James Slagle
d4d15d0407 Default bootstrap_server_id
When blacklisting all servers from the primary role, the yaql expression
to get the bootstrap_server_id value fails as it tries to index the list
at the 0th element. In this case, default the bootstrap_server_id value
to a constant string which won't match any actual server IDs.

Change-Id: Ibb26245156675f64709bab075875ce4b498b4db6
Closes-Bug: #1785665
2018-08-06 17:46:08 -04:00
James Slagle
553fc0d264 Pass all vars to deploy-steps-tasks.yaml with config-download
Not all vars were getting passed to deploy-steps-tasks.yaml when using
config-download. This didn't cause any issues because all the vars have
default values, but user-specified values should be honored as well.

Change-Id: I5972e1c674cf9008366c2bb10b54eb975ab8cb93
Closes-Bug: #1785635
2018-08-06 10:15:56 -04:00
James Slagle
6b506eea2c Parallelize server pre and post steps
Update the play for the server pre and post steps so that the tasks run
in parallel across all roles, instead of doing one role at a time. By
not using the "when" attribute, and relying on the tripleo_role_name var
for the list of deployments, we can force these tasks to run in parallel
across all roles.

Change-Id: I83a4deaa68d5788edb5ab13652bb30c762f337d8
2018-08-06 13:26:59 +00:00
Jiri Stransky
4504aadef6 Allow performing Ceph update/upgrade separately
Running `openstack overcloud external-update run` will update all
external services. This commit adds possibility of running `openstack
overcloud external-update run --tags ceph` to specifically update just
Ceph. It works analogously for upgrades.

Change-Id: Ic1786b6dbfa54516bfb836b450fc35452dca8cb5
Partial-Bug: #1783949
2018-08-02 15:04:22 +02:00
Jiri Stransky
6364f2286c Update and upgrade tasks for services deployed via external deploy tasks
Composable service templates can now define external_update_tasks and
external_upgrade_tasks. They are meant for update/upgrade logic of
services deployed via external_deploy_tasks. The external update
playbook first executes external_update_tasks and then
external_deploy_tasks, the procedure for upgrades works
analogously. All happens within a single playbook, so variables or
fact overrides exported from the update/upgrade tasks will be
available to the deploy tasks during the update/upgrade procedure.

Partial-Bug: #1783949
Change-Id: Ib2474e8f69711cd6610a78884d5032ffd19ad249
2018-08-02 15:04:15 +02:00
Emilien Macchi
6860fb84f5 Switch deployment_source_hosts default to "Undercloud"
"undercloud" host is too opinionated and hostnames can change. We should
rather apply the tasks to the Undercloud HostGroup, which contains one
host for now: the actual undercloud hostname.

So this patch switches "undercloud" to "Undercloud" so when the hostname
isn't "undercloud", the external tasks will run correctly on this host.

Change-Id: I7200f930387406e6cc8e6fee6d5278768074c892
Closes-Bug: #1784910
2018-08-01 16:35:21 -04:00
Zuul
44514779bc Merge "Don't run host_prep_tasks from {{role}}HostPrepDeployment" 2018-07-26 06:58:56 +00:00
James Slagle
697b1d9438 Don't run host_prep_tasks from {{role}}HostPrepDeployment
host_prep_tasks are run from deploy_steps_playbook.yaml, so there's no
need to also run them as part of the {{role}}HostPrepDeployment
resources.

Change-Id: If1bf6dda19e6e0b875463c421f9504efab85251b
2018-07-23 19:37:36 -04:00
Emilien Macchi
b3a7cfc43f ansible: replace yum module by package module when possible
Problem: RHEL and CentOS8 will deprecate the usage of Yum.

From DNF release note:
DNF is the next upcoming major version of yum, a package
manager for RPM-based Linux distributions.
It roughly maintains CLI compatibility with YUM and defines a strict API for
extensions.

Solution: Use "package" Ansible module instead of "yum".

"package" module is smarter when it comes to detecting which package manager
runs on the system. The goal of this patch is to support both yum/dnf
(dnf will be the default in rhel/centos 8) from a single ansible module.
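
The kind of replacement involved can be sketched as follows (the package
name is a placeholder):

```yaml
# Before: tied to yum.
- name: Install a package with the yum module
  yum:
    name: example-package
    state: present

# After: the generic module uses the package manager found on the host.
- name: Install a package with the package module
  package:
    name: example-package
    state: present
```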

Change-Id: I8e67d6f053e8790fdd0eb52a42035dca3051999e
2018-07-21 00:17:25 +00:00
Zuul
1f7062b0ef Merge "Add blankline for readability" 2018-07-13 06:09:09 +00:00
Zuul
65795c744a Merge "Remove unuseful become: true from deploy-steps" 2018-07-11 00:03:10 +00:00
Zuul
d2994ca593 Merge "Replace deprecated include module with include/import_tasks module" 2018-07-10 20:09:42 +00:00
Bogdan Dobrelya
f29e2c7a22 Double the docker puppet process counts
Deploy steps run the docker puppet steps with a maximum of
3 processes. It takes about 30 min to finish the
containers configuration for a typical overcloud (in CI).

Double the numbers to allow more puppet processes to finish their
tasks sooner.

Change-Id: Id0b0371e7f21f56528027921732ade786525d659
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
2018-07-05 14:01:10 +03:00
Saravanan KR
1f9881554c Replace deprecated include module with include/import_tasks module
include module is deprecated. The alternate is to use import_tasks
for static file tasks inclusion and include_tasks for dynamic tasks
file inclusions (like using with_items).
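
A short sketch of the two replacements (all file names below are
placeholders):

```yaml
# Static inclusion: resolved when the playbook is parsed.
- import_tasks: static_tasks.yaml

# Dynamic inclusion: resolved at run time, so it can be looped.
- include_tasks: "{{ item }}"
  with_items:
    - first_tasks.yaml
    - second_tasks.yaml
```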

Change-Id: I8b3bf3ba3d7c2cfbe1187218c51f619e65efe0e5
2018-07-05 16:01:14 +05:30
Jiri Stransky
6b3e8aa073 Execute post_update_tasks in update playbook
We drop the post_update_steps_playbook, and execute post_update_tasks
as part of the update_steps_playbook. This will ensure that
post_update_tasks are executed, and they're executed in accordance
with the `serial: 1` ordering that update_steps_playbook is using. (We
want to avoid an issue similar to what we've had in bug #1776206).
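
As a rough illustration of the resulting layout (not the actual rendered
playbook; the host pattern and task file names are placeholders):

```yaml
# Hypothetical shape of the update play; task file names are placeholders.
- hosts: overcloud
  serial: 1   # run the whole play on one host before moving to the next
  tasks:
    - import_tasks: example_update_tasks.yaml
    - import_tasks: example_post_update_tasks.yaml
```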

Change-Id: I15a984172cd5532bc966269d8c68f27b5703733e
Closes-Bug: #1778471
2018-06-25 10:39:56 +02:00
Raoul Scarazzini
c494a508f8 Remove unuseful become: true from deploy-steps
The ansible command generated in ansible-playbook-command.sh has
"--become" in it by default.
This commit removes "become: true" where it is used, to avoid confusion in
deploy steps. Today we explicitly set "become: false" in deploy-steps.j2
for certain actions, so there's no point in also having "become: true"
for the other ones.
We have a release note [1] that explains why the "become" was
introduced, but maybe we can revisit it.

[1] releasenotes/notes/use-become-true-in-deploy-steps-playbook-01decb18d895879f.yaml

Change-Id: Ic666b4ecaecf0591dd8bb0406f239649b20b9623
2018-06-13 16:28:48 +02:00
Jiri Stransky
416b35f4c3 Updates: run host_prep_tasks between update tasks and deployment tasks
We should re-run host_prep_tasks as part of the minor update, to make
sure the host is ready for starting the updated containers. The right
place for them is between update tasks and deployment tasks.

This is important in case we deliver changes to host_prep_tasks during
minor update, or if update_tasks do something that would partially
undo the host preparation, e.g. clear/delete some directories on the
host to get rid of previous state.

Change-Id: Ic0a905a8c4691cbe75113131bd84e8a39dea046d
Related-Bug: #1776206
2018-06-11 14:06:40 +02:00
James Slagle
e01005f7af Add blankline for readability
Improves readability of the rendered deploy_steps_playbook.yaml playbook.

Change-Id: I327a905aaef3acb1e5a7939d7adbc806645f46f8
2018-06-05 09:05:13 -04:00
Zuul
25f583c640 Merge "Add stack name to env() for OS::TripleO::WorkflowSteps" 2018-05-31 11:27:15 +00:00
Alex Schultz
3a9c0e8420 Parameterized deployment hosts
In order to make the deployment more flexible, we should allow the
ansible hosts to be configured independently of the old
undercloud/overcloud concepts. Rather than assume
'undercloud'/'overcloud', we should allow for these to include the same
set of hosts.  This change introduces 'deployment_source_hosts' and
'deployment_target_hosts' variables that can be used to control where
the tasks are run.
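
A simplified sketch of plays driven by the two variables, assuming they
are supplied as extra vars when the playbook is run (how the real
templates render them may differ):

```yaml
# Hypothetical plays; the tasks are placeholders.
- hosts: "{{ deployment_source_hosts }}"
  tasks:
    - debug:
        msg: "runs on the source host(s)"

- hosts: "{{ deployment_target_hosts }}"
  tasks:
    - debug:
        msg: "runs on the target host(s)"
```

For an all-in-one style setup, both variables could then point at the
same host or group.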

Change-Id: I249cc7e179bc1423788aab967c4b2e3f9ffc81d4
Related-Blueprint: all-in-one
2018-05-29 14:49:07 -06:00
Zuul
161156d750 Merge "NFV: Support for config-download to deploy node with kernel args" 2018-05-29 18:52:58 +00:00
John Fulton
d51bd9e1bd Add stack name to env() for OS::TripleO::WorkflowSteps
In I4b576a6e7fbfb18fa13221e2d080bf7876a8303e state information
will be persisted in Swift and the name of the Swift container
should be a function of the Heat stack in case multiple stacks
are deployed. This patch passes the name of the Heat stack to
the Mistral environment so that the workflow may access the
Heat stack name and name the Swift container accordingly.

Change-Id: I995ad32345a39238ffb9cbcf9966dedc60c75ff8
Related-Bug: #1769769
2018-05-28 21:41:09 +02:00
Emilien Macchi
1bec01137e deploy-steps: switch to tripleo_role_name
"role_name" is internal to Ansible, we should not use it.
This patch uses the new variable set in the inventory, tripleo_role_name,
which holds the TripleO role name rather than the Ansible role name;
the two are very different things.

Depends-On: I57c4eac87e2f96dfe5490b111cd2508505715d56
Change-Id: Iecaf6f1b830e65be2f9e2e44431054fe46f9f565
Related-Bug: #1771171
2018-05-15 16:38:29 +00:00
Saravanan KR
a3e4a90636 NFV: Support for config-download to deploy node with kernel args
For NFV deployments, specific kernel args should be applied and
the nodes should be restarted before running the NetworkDeployment.
It is supported in the heat deployment via PreNetworkConfig. In the
config-download mechanism, ansible steps need to be improved
to handle the reboot and wait for the node.

Change-Id: I43b383ad0e04b8be6c321f8c5b05e628b2520141
2018-05-15 11:01:06 +05:30
Carlos Camacho
44ef2a3ec1 Change template names to rocky
The new master branch should point now to rocky.

So, HOT templates should specify that they might contain features
for the rocky release [1].

Also, this submission updates the yaml validation to use only the latest
heat_version alias. There are cases in which we will need to set
the version for specific templates, i.e. mixed versions, so a variable
is added to assign specific templates to specific heat_version
aliases, avoiding the introduction of errors by bulk replacing
the old version in new releases.

[1]: https://docs.openstack.org/heat/latest/template_guide/hot_spec.html#rocky
Change-Id: Ib17526d9cc453516d99d4659ee5fa51a5aa7fb4b
2018-05-09 08:28:42 +02:00
James Slagle
1497da08d5 Run tasks on primary role first
This patch adds the primary role name as the first host pattern in the
individual plays in deploy-steps.j2. This will ensure that the primary
role will execute tasks first, which is needed so that all Pacemaker
nodes run the same step at the same time.

Change-Id: I9c499be87ce51ae28914b013b4b91446a3a68015
Closes-Bug: #1768238
2018-05-01 13:23:11 -04:00
Jiri Stransky
19be98ba07 No-op Mistral workflow resources for update/upgrade/ffwd
So far we haven't been disabling workflows for update/upgrade. We
should disable them by default as they could have the potential to
disrupt the update/upgrade/ffwd procedure.

The main example of a thing we deploy via the workflow resources is
Ceph. We decided no-opping ceph-ansible for the main
update/upgrade/ffwd upgrade steps is the safest path forward and we'll
update/upgrade Ceph after the main procedure is finished.


Change-Id: I34c7213ab7b70963ad2e50f7633b665fad70bde5
2018-04-23 10:47:58 +00:00
Zuul
3fdb4c85a9 Merge "Add spacing for readability" 2018-04-10 03:54:22 +00:00
James Slagle
7089f06bd5 Support deploy_steps_tasks step 0
These tasks would run before any individual server deployments. A
specific use case is for rebooting dpdk/nfv nodes before applying
NetworkDeployment, etc.

Change-Id: I9e410def25184e635568db67149264ac89c999ed
2018-04-04 09:57:49 -04:00
James Slagle
0b23ff7ec9 Add spacing for readability
Add blank lines between the Ansible tasks and plays in the stack
outputs. This is an improvement in readability for the user.

Change-Id: I52ebd9081cacf213ac29f1d24e73db6ea5cfe33f
2018-04-03 11:56:15 -04:00
Zuul
8fd00675e8 Merge "Remove no longer used disable_upgrade_deployment flag" 2018-04-03 05:30:26 +00:00
Zuul
1e2cdd60aa Merge "Support SshKnownHostsDeployment with config-download" 2018-03-29 21:45:09 +00:00
mandreou
66df6bdb46 Remove no longer used disable_upgrade_deployment flag
In I75f087dc456c50327c3b4ad98a1f89a7e012dc68 we removed much of
the legacy upgrade workflow. This now also removes the
disable_upgrade_deployment flag and the tripleo_upgrade_node.sh
script, both of which are no longer used and have no effect on
the upgrade.

Related reviews
    I7b19c5299d6d60a96a73cafaf0d7103c3bd7939d tripleo-common
    I4227f82168271089ae32cbb1f318d4a84e278cc7 python-tripleoclient

Change-Id: Ib340376ee80ea42a732a51d0c195b048ca0440ac
2018-03-29 15:27:30 +03:00
Zuul
86204c805c Merge "Add EnablePuppet (defaults to true)" 2018-03-23 09:44:59 +00:00
James Slagle
088d5c12f0 Support SshKnownHostsDeployment with config-download
Add support for the SshKnownHostsDeployment resources to
config-download. Since the deployment resources relied on Heat outputs,
they were not supported with the default handling from tripleo-common
that relies on the group_vars mechanism.

Instead, this patch refactors the templates to add the known hosts
entries as global_vars to deploy_steps_playbook.yaml, and then includes
the new tripleo-ssh-known-hosts role from tripleo-common to apply the
same configuration that the Heat deployment did.

Since these deployments no longer need to be triggered when including
config-download-environment.yaml, a mapping is added that can be
overridden to OS::Heat::None to disable the deployment resources when
using config-download.

The default behavior when not using config-download remains unchanged.

Closes-Bug: #1746336
Change-Id: Ia334fe6adc9a8ab228f75cb1d0c441c1344e2bd9
2018-03-19 07:50:06 -04:00
Jiri Stransky
ae085825e2 Add pre_upgrade_rolling_tasks
The resulting pre_upgrade_rolling_steps_playbook will be executed in a
node-by-node rolling fashion at the beginning of the major upgrade
workflow (before upgrade_steps_playbook).

The current intended use case is special handling of L3 agent upgrade
when moving Neutron services into containers. Special care needs to be
taken in this case to preserve L3 connectivity of instances (with
regard to dnsmasq and keepalived sub-processes of L3 agent).

The playbook can be run before the main upgrade like this:

openstack overcloud upgrade run --roles overcloud --playbook pre_upgrade_rolling_steps_playbook.yaml

Partial-Bug: #1738768
Change-Id: Icb830f8500bb80fd15036e88fcd314bf2c54445d
Implements: blueprint major-upgrade-workflow
2018-03-16 12:37:19 +01:00
Lukas Bezdicka
26c55d15cd FFU: Introduce post FFU steps and use them for queens switch
In the last step of FFU we need to switch repos before running the upgrade.
We do so by introducing post FFU steps and running the switch in
them. We also update heat agents and os-collect-config on nodes.

Change-Id: I649afc6fa384ae21edc5bc917f8bb586350e5d47
2018-03-15 14:33:19 +01:00