198 Commits

Author SHA1 Message Date
Zuul
e2efdbedc1 Merge "Allow containerized undercloud deploy with SELinux" 2017-11-08 08:13:44 +00:00
James Slagle
9ae319bf1e Set become:false for undercloud plays
Since the undercloud is localhost, ansible skips ssh and just runs local
commands. That will cause problems when running ansible-playbook under
the mistral workflow because the mistral user cannot use sudo. Set
become:false on all the undercloud plays as sudo is not actually needed.

Change-Id: I2980c584d2f4ee5c2de3720eecfc80cc43ee1fa6
implements: blueprint ansible-config-download
2017-11-07 07:45:50 -05:00
Bogdan Dobrelya
1fc9285125 Allow containerized undercloud deploy with SELinux
When SELinux is enforcing, use the docker volume mount flag
:z for the docker-puppet tool's bind-mounted volumes in RW mode.
Note that if a volume is mounted with Z, the label is specific
to the container and cannot be shared between containers.

Volumes from /etc/pki mounted RO do not require the context changes.
For those RO volumes that do require it, use :ro,z.

For deploy-steps, make sure ansible file resources in /var/lib/
get the same SELinux context attributes that docker's :z
provides.
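
The labeling rule above can be sketched in Python as follows. This is an
illustration only: the helper name selinux_volume, its relabel parameter
and the example paths are hypothetical and not taken from docker-puppet;
only the :z and :ro,z suffixes come from this change.

```python
def selinux_volume(spec, selinux_enforcing, relabel=True):
    """Return a docker bind-mount spec with the shared SELinux label flag.

    spec is 'host_path:container_path' or 'host_path:container_path:ro'.
    Hypothetical helper for illustration, not the docker-puppet code.
    """
    if not selinux_enforcing or not relabel:
        # Nothing to do when SELinux is not enforcing, or for volumes
        # (such as RO mounts from /etc/pki) that need no context change.
        return spec
    parts = spec.split(':')
    if len(parts) == 3:
        # Existing options such as 'ro': append the shared label flag.
        options = parts[2].split(',')
        if 'z' not in options:
            options.append('z')
        return ':'.join(parts[:2] + [','.join(options)])
    # No options means a read-write mount: add ':z'.
    return spec + ':z'
```

A RW mount such as '/var/lib/config-data:/var/lib/config-data' gains ':z',
while an RO mount that needs relabeling ends in ':ro,z'.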

Partial-bug: #1682179
Related-bug: #1723003

Change-Id: Idc0caa49573bd88e8410d3d4217fd39e9aabf8f2
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
2017-11-06 15:04:18 +01:00
Zuul
e463ca15fb Merge "Speed up deployment by reusing facts" 2017-11-06 05:09:24 +00:00
Jiri Stransky
42b92efba3 Speed up deployment by reusing facts
The overcloud deployment playbook consists of several plays. Facts
gathered in a single playbook persist between plays, but by default
each play gathers facts from involved nodes before applying any
roles/tasks. That resulted in gathering facts many times.

This commit changes the deployment playbook so that we gather facts
once at the beginning, and then reuse them for subsequent plays.

Also any_errors_fatal is added to make sure that when one host fails,
subsequent tasks aren't attempted on the other hosts either.

Change-Id: I192ea99105bd188554d45a6e4290bb33d1f08ff1
2017-11-02 11:41:22 +01:00
Michele Baldessari
11e599d116 Add --detailed-exitcodes when running puppet via ansible
The puppet run never fails, even when it should, since we moved
to the ansible way of applying it. The reason is the following code:

    - name: Run puppet host configuration for step {{step}}
      command: >-
        puppet apply
        --modulepath=/etc/puppet/modules:/opt/stack/puppet-modules:/usr/share/openstack-puppet/modules
        --logdest syslog --logdest console --color=false
        /var/lib/tripleo-config/puppet_step_config.pp

The above is missing the --detailed-exitcodes switch and so puppet will never
really error out on us and the deployment will keep on running all the
steps even though a previous puppet manifest might have failed. This
causes extra hard-to-debug failures.

Initially the issue was observed on the puppet host runs, but this
parameter is missing also from docker-puppet.py, so let's add it there
as well as it makes sense to return proper error codes whenever we call
puppet.

Besides this being a good idea in general, we actually *have* to do it
because puppet does not fail correctly without this option due to the
following puppet bug:
https://tickets.puppetlabs.com/browse/PUP-2754
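
For reference, with --detailed-exitcodes puppet apply exits with 0 (no
changes), 2 (changes applied), 1 (run failed), 4 (some resources failed)
or 6 (changes and failures), so a caller has to accept 2 as success. A
minimal Python sketch of that check follows; the function name is
hypothetical, and how the ansible task or docker-puppet.py actually tests
the code is not shown in this message.

```python
# Exit codes documented for `puppet apply --detailed-exitcodes`.
PUPPET_OK_NO_CHANGES = 0
PUPPET_OK_WITH_CHANGES = 2


def puppet_run_failed(returncode):
    """Return True when a --detailed-exitcodes return code means failure.

    Hypothetical helper: 0 and 2 are successful runs, everything else
    (1, 4, 6) indicates that the run or some resources failed.
    """
    return returncode not in (PUPPET_OK_NO_CHANGES, PUPPET_OK_WITH_CHANGES)
```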

Depends-On: I607927c2ee5c29b605e18e9294b0f91d37337680
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>

Change-Id: Ie9df4f520645404560a9635fb66e3af42b966f54
Closes-Bug: #1723163
2017-10-26 20:03:11 +00:00
Zuul
31488edbc4 Merge "Add external deployment tasks executed on undercloud" 2017-10-22 13:02:39 +00:00
Steven Hardy
afba3b12bd Default pre_deployments/post_deployments to empty lists
This is needed because these aren't defined by default when deploying
without config-download-environment.yaml and
https://review.openstack.org/#/c/508189/, but it's still useful to allow
re-running the deploy steps for debugging, e.g:

ansible-playbook -v -i /usr/bin/tripleo-ansible-inventory deploy_steps_playbook.yaml

Currently this doesn't work anymore, because these variables are undefined.

Change-Id: I2d99c1cb8bf4ccd8581e78d914d438e5de544219
2017-10-17 17:33:14 +01:00
Zuul
ba4a74665f Merge "Fix ConfigDebug for puppet host runs" 2017-10-16 14:40:34 +00:00
Michele Baldessari
ecc6ce340a Fix ConfigDebug for puppet host runs
Before pike we used to be able to add -e environments/config-debug.yaml
and that would give us debug logs for puppet. With the move to ansible
running puppet we lost this feature.

Let's make sure that the old ConfigDebug variable still works with
the ansible playbook-based deploy steps. With this patch and ConfigDebug
set to true, we correctly get the puppet debug logs:

TASK [debug] *******************************************************************
ok: [localhost] => {
    "(outputs.stderr|default('')).split('\n')|union(outputs.stdout_lines|default([]))": [
        "Warning: Undefined variable 'deploy_config_name'; ",
        "   (file & line not available)",
        "Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Bool. There is further documentation for validate_legacy function in the README. at [\"/etc/puppet/modules/ntp/manifests/init.pp\", 54]:[\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\", 29]",
        "   (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')",
        "Debug: Runtime environment: puppet_version=4.8.2, ruby_version=2.0.0, run_mode=user, default_encoding=UTF-8",
        "Debug: Loading external facts from /etc/puppet/modules/openstacklib/facts.d",
        "Debug: Loading external facts from /var/lib/puppet/facts.d",
....

Change-Id: Ia726fb8ca4a6f7bbbd7a1284d76ff42df6825d01
Closes-Bug: #1722752
2017-10-16 08:50:55 +02:00
Jiri Stransky
80eff5f4d7 Add external deployment tasks executed on undercloud
Services can define external_deploy_tasks, which are meant to be
executed on the undercloud node. They are step-based as the other
Ansible tasks we have, and they get executed during each deployment
step before the puppet and docker tasks.

These tasks can be used to perform complex actions from the
undercloud, such as executing nested installers like kubespray or
ceph-ansible. This should allow deploying the overcloud with a single
Ansible playbook, and without creating an Ansible->Mistral->Ansible loop.

Implements: blueprint ansible-config-download
Change-Id: I3dcafb96f5cea5fdcebe2b2012b61a38b0568834
Depends-On: I8491540edf78711f3229eabeda22a17cd55e99c8
2017-10-13 17:24:54 +02:00
James Slagle
a0e6d30ca2 Config download support for standalone deployments
Presently, "openstack overcloud config download" does not support all
Deployment resources, only those that are included in the RoleData and
are natively of type group:ansible.

This patch adds support for also pulling all the deployment data for
OS::Heat::SoftwareDeployment (singular) resources applied to individual
servers of any group type. Those resources are mapped to a new nested
stack via the config-download-environment.yaml environment.

The nested stack has the same interface as a SoftwareDeployment but only
creates an OS::Heat::Value resource. The "config download" code will be
updated in a separate patch to read the deployment data from these Value
resources and apply them via ansible.

The related tripleo-common patch (which depends on this patch) is:
I7d7f6b831b8566390d8f747fb6f45e879b0392ba

implements: blueprint ansible-config-download
Change-Id: Ic2af634403b1ab2924c383035f770453f39a2cd5
2017-10-12 22:34:09 +00:00
Jenkins
8feb27ab0d Merge "Disable role host_prep_tasks on controlplane upgrade" 2017-10-04 02:59:21 +00:00
James Slagle
bb24fbfef3 Use "become: true" in deploy steps tasks
In the deploy steps playbook downloaded via "openstack overcloud config
download", all the tasks require sudo. The tasks should use "become:
true", otherwise they fail with permission denied errors.

Change-Id: I561b5ef6dee0ee7cac67ba798eda284fb7f7a8d0
Closes-Bug: #1717298
2017-09-27 13:17:33 -04:00
James Slagle
320f80dbae Start sequence at 1 for deploy steps playbook
We should start the sequence at 1 instead of 0, since all our puppet
manifests assume the first step is 1. Trying to run our puppet manifests
with a hieradata value of step=0 actually results in an error because no
classes are included.

Change-Id: I93dc8b4cefbd729ba7afa3a4d81b4ac95344cac2
Closes-Bug: #1717292
2017-09-26 11:29:39 -04:00
marios
684267a7a4 Disable role host_prep_tasks on controlplane upgrade
During the controlplane upgrade the host_prep_tasks are being
executed on the disable_upgrade_deployment roles too.

This sets the role specific host_prep_tasks to an empty list for
those roles during an upgrade, as executing them during the
controlplane upgrade (during -e
major-upgrade-composable-steps-docker.yaml) causes problems.

They will be executed as part of the non controller upgrade as they
are written to the stack outputs to be used as ansible playbooks
(see bug 1708115 for more info on this)

Change-Id: I42c963440b9b1e8222097c3d4e83ffcbe820886c
Closes-Bug: 1719604
2017-09-26 16:41:40 +03:00
Jenkins
4a6ab5bfb2 Merge "Remove deploy_steps_tasks.yaml from upgrade_steps_playbook" 2017-09-20 23:59:24 +00:00
Jenkins
52e1a0c943 Merge "Adds post_upgrade_tasks for any service post-upgrade ansible tasks" 2017-09-20 11:16:26 +00:00
Jenkins
ab682ed638 Merge "Rename service_workflow_tasks into workflow_tasks" 2017-09-14 17:01:44 +00:00
Marius Cornea
e471c67aab Remove deploy_steps_tasks.yaml from upgrade_steps_playbook
After landing https://review.openstack.org/#/c/503484/ we run the
puppet host configuration steps twice. This change removes the
deploy_steps_tasks.yaml playbook in order to run the puppet steps
only once.

Closes-bug: 1717244
Change-Id: I09461094618124915841c8390c8bce8daf64d029
2017-09-14 14:01:52 +02:00
Jenkins
3c6f26e890 Merge "Revert "Tag workflows created by the templates"" 2017-09-13 17:46:39 +00:00
Giulio Fidente
09137304b9 Rename service_workflow_tasks into workflow_tasks
Using the service_ prefix seems incoherent with its use in
service_config_settings (vs config_settings).

Change-Id: Ia39f181415bee0071409dabddfa0c5c312915e1f
2017-09-13 17:15:17 +02:00
Alex Schultz
ab7fd80008 Revert "Tag workflows created by the templates"
This reverts commit a7a02f0da866c66dce9757a42bf56144cfa70d5a.

This change requires heat functionality which is not yet available, so
the scenario001-containers job fails because of the new tags. Reverting
to unblock CI; this should be back after we have a heat promotion.

Change-Id: Ib0fed291c1c4e41d1ea0bb7fc2ccbdabac1d336b
Closes-Bug: #1716915
2017-09-13 12:42:28 +00:00
Jenkins
3dcc9b30e9 Merge "Tag workflows created by the templates" 2017-09-13 10:56:28 +00:00
marios
2e182bffee Adds post_upgrade_tasks for any service post-upgrade ansible tasks
This adds a new config/deployment per role that will come after any
post deploy steps. It drives the same ansible config as the
upgrade_tasks but instead collects the post_upgrade_tasks for any
service in the given role.

The workflow is upgrade_tasks, then post deploy steps (either
puppet/ or docker/ depending on the env) and then the
post_upgrade_tasks added here.

This is added to the pacemaker/cinder-volume.yaml service for now;
see the bug below for more info.

Change-Id: Iced34fecf02ebddc91df9302de54d2f4c2cab680
Closes-Bug: 1706951
2017-09-12 18:43:16 +03:00
Steven Hardy
27018b4182 Add RoleConfig output to major_upgrade_steps.j2.yaml
I96ec09bc788836584c4b39dcce5bf9b80e914c71 added this output to the
deploy-steps.j2, but missed adding this to the major upgrade template
which means the overcloud RoleConfig output is broken after the upgrade
(until the converge update switches back to the deploy-steps.j2 derived
template)

Closes-Bug: #1716404
Change-Id: I331fa18b456ca2d6c124316d513374e3fe5a5007
2017-09-12 09:58:50 +01:00
Giulio Fidente
a7a02f0da8 Tag workflows created by the templates
This is useful to easily filter workflows created by the templates
and for a specific stack.

Change-Id: I0a26cacaf5ad5709881043434694c9254a9e710b
Related-Bug: #1715389
2017-09-11 17:01:52 +00:00
Steven Hardy
94c7752cfa Set mode for ansible written files
Use a more restrictive mode for these files, as some may contain sensitive data
which shouldn't be world readable
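
As an illustration of the idea, here is a small Python sketch that writes
a file which only its owner can read. The mode 0600 and the helper name
are assumptions for the example; this message does not state which mode
the ansible tasks use.

```python
import os


def write_private_file(path, data, mode=0o600):
    """Write data to path so that only the owner can read or write it.

    Hypothetical helper; the 0o600 default is an assumption.
    """
    # Create the file with the restrictive mode from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, 'w') as handle:
        # Enforce the mode even if the file already existed or the
        # umask changed the bits requested at creation time.
        os.fchmod(handle.fileno(), mode)
        handle.write(data)
```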

Closes-Bug: #1714986
Change-Id: Ib1e79b1d4e25d6e329938402b1ca776bdab81bdd
2017-09-04 16:38:26 +01:00
Jenkins
e80b2f0191 Merge "Use list_concat in place of yaql" 2017-09-01 18:27:28 +00:00
Jenkins
70718ff4ca Merge "Remove puppet run and workarounds from tripleo_upgrade_node.sh" 2017-08-31 14:40:39 +00:00
Thomas Herve
8008089de2 Use list_concat in place of yaql
Where applicable, use list_concat instead of yaql to build new lists: it
should be more resilient to errors, easier to debug, and less expensive.
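
For readers unfamiliar with it, list_concat joins several lists into one.
The Python sketch below mirrors only that basic behaviour and is not
heat's implementation.

```python
from itertools import chain


def list_concat(*lists):
    """Concatenate any number of lists into a single new list.

    Illustrative equivalent of the basic list_concat behaviour only.
    """
    return list(chain.from_iterable(lists))
```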

Change-Id: I6d3dbc7ee8eac50f46023a35af4ec7f2d378fd87
Related-Bug: #1714005
2017-08-30 15:43:16 +02:00
marios
4c5b9c5c96 Remove puppet run and workarounds from tripleo_upgrade_node.sh
For bug 1708115 and the O..P upgrade, and for the upgrade of
'non-controllers' we are now generating ansible playbooks from
collected service upgrade_tasks and these are executed instead
of the legacy tripleo_upgrade_node.sh.

To clarify, by 'non-controllers' it is meant any node for which
the corresponding roles_data.yaml role has the
disable_upgrade_deployment flag set True.

As a first pass, I am removing the workarounds from the script but
keeping its delivery mechanism for now in case it is needed still.
We can either update here to remove it or keep it until the next cycle.

The most important part for now is that we no longer 'manually'
run puppet here. Instead the post_deploy_steps are also collected
into a playbook and will be executed after the upgrade_tasks
(see the bug for discussion of the mechanism and related reviews)

Change-Id: Ib017b0ab435ca9558cf8659d434489cdf01df955
Related-Bug: 1708115
2017-08-29 14:29:37 +03:00
Dan Prince
949d367dde Add DockerPuppetProcessCount defaults to 3
docker-puppet.py is very aggressive about running concurrently.
It uses python multiprocessing to run multiple config generating
containers at once. This seems to work well in general, but
in some cases, perhaps when the registry is slow or under
heavy load, it can cause timeouts to occur. Lately I'm seeing
several 'container did not start before the specified timeout'
errors that always seem to occur when config files are generated
(when docker-puppet.py is initially executed).

A couple of things:

 - When config files are generated, this is the first time
   most of the containers are pulled to each host machine
   during deployment.

 - docker-puppet.py runs many of these processes at once. Some
   of them run faster, others do not.

 - The docker daemon's pull limit defaults to 3. This would throttle
   the above a bit, perhaps contributing to the likelihood of a timeout.

One solution that seems to work for me is to set the PROCESS_COUNT
in docker-puppet.py to 3. As this matches docker daemon's default
it is probably safer at the cost of being slightly slower in some
cases.
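
The effect of capping the worker count can be sketched in Python as
follows. This is not the docker-puppet.py code: threads stand in for the
processes it uses so that the example stays self-contained, the function
name is hypothetical, and reading PROCESS_COUNT from the environment is
an assumption.

```python
import os
from multiprocessing.pool import ThreadPool


def run_bounded(jobs, worker, default_count=3):
    """Run worker over jobs with at most PROCESS_COUNT running at once.

    Hypothetical helper; the default of 3 matches the docker daemon
    pull limit mentioned above.
    """
    count = int(os.environ.get('PROCESS_COUNT', default_count))
    with ThreadPool(count) as pool:
        # map() blocks until every job finishes and keeps input order.
        return pool.map(worker, jobs)
```

With the default, at most three config generating jobs are in flight at
any time, at the cost of a longer total run when many jobs are queued.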

Change-Id: I17feb3abd9d36fe7c95865a064502ce9902a074e
Closes-bug: #1713188
2017-08-25 23:01:24 -04:00
Mathieu Bultel
d9d8314d26 Specify the start count to 0 for the update step loop
Force the count to start at 0 to ensure that the
update step loop starts at 0 and executes
update step 0.

Closes-Bug: #1712498

Change-Id: I71be55c1f56e53e5c565bec281795d63e5845ff6
2017-08-23 08:25:57 +00:00
marios
060ff37c4f Also write an upgrade_tasks_playbook
To get this to work, upgrade_tasks need to be rewritten with 'when'
statements like the update tasks (in the parent review from shardy).
So that we don't break the existing upgrades workflow, we add these
as part of the config download; see the Depends-On below.

Related-Bug: 1708115
Depends-On: Ief593dc758a2ffe33c1cbcbda9289393fcf023e4
Change-Id: Ib01b96a2c26721747d81d98e3d57c4c388663004
2017-08-15 11:48:48 +00:00
Steven Hardy
ec58a4b6c5 Add environment to disable deploy steps
This enables either deploying without configuring any services, or
temporarily disabling the deploy steps such as will be required
for minor updates where we want to re-run the rolling update outside
of heat.

To deploy directly via ansible-playbook you can do e.g:

openstack overcloud config download --config-dir tmpconfig
cd tmpconfig/tripleo-6b02U7-config
ansible-playbook -vvv -b -i /usr/bin/tripleo-ansible-inventory deploy_steps_playbook.yaml

Which will run the same ansible steps as we normally run via heat.

Change-Id: I59947b67523dfcc43d454d4ac7d82b06804cf71d
2017-08-12 10:40:57 +00:00
Steven Hardy
1801565a75 Add support for update_tasks
These work the same way as upgrade_tasks *but* they use a step variable
instead of tags, so we can iterate over a count/sequence which isn't
possible via a wrapper playbook with tags (we may want to align upgrade
tasks with the same approach if this works out well).

Note the tasks can be run via ansible-playbook on the undercloud, like:

openstack overcloud config download --config-dir tmpconfig
cd tmpconfig/tripleo-HCrDA6-config
ansible-playbook -b -i /usr/bin/tripleo-ansible-inventory update_steps_playbook.yaml --limit controller

The above will do a rolling update for the Controller role (note the inconsistent
capitalization, we probably need to fix the group naming in tripleo-ansible-inventory)
because we specify serial: 1 in the playbook.
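
The rolling behaviour that serial: 1 gives can be pictured with a short
Python sketch. It only illustrates how hosts are split into batches that
run one after another; it is not ansible code and the function name is
hypothetical.

```python
def serial_batches(hosts, serial=1):
    """Split hosts into consecutive batches of at most `serial` hosts.

    Each batch finishes all tasks before the next batch starts, which
    is what makes the update roll through the role node by node.
    """
    if serial < 1:
        raise ValueError('serial must be at least 1')
    return [hosts[i:i + serial] for i in range(0, len(hosts), serial)]
```

With three controllers and serial=1 this yields three batches of one
node each, so only one controller is updated at a time.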

You can also trigger an update explicitly on one node like this, which is useful for debugging:

ansible-playbook -vvv -b -i /usr/bin/tripleo-ansible-inventory update_steps_playbook.yaml --limit overcloud-controller-0

Change-Id: I20bb3e26ab9d9cadf1a31fd304de8a014a901aa9
2017-08-12 10:40:48 +00:00
Steven Hardy
46279be9cb Add RoleConfig output
This exposes the deploy workflow for all roles from deploy-steps
via overcloud.j2.yaml - which means we can write it via the new
openstack overcloud config download command and/or run the workflow
outside of heat via mistral

With https://review.openstack.org/#/c/485732/ applied to
tripleoclient it becomes possible to do:

openstack overcloud config download --config-dir tmpconfig
cd tmpconfig/tripleo-EvEZk0-config
ansible-playbook -b -i /usr/bin/tripleo-ansible-inventory deploy_steps_playbook.yaml

This runs the deploy steps via ansible-playbook for all overcloud nodes,
exactly the same as those normally run via heat (--limit can be used to
restrict to specific nodes/roles).

Change-Id: I96ec09bc788836584c4b39dcce5bf9b80e914c71
2017-08-12 10:40:41 +00:00
Steven Hardy
38db8e7d49 Default docker_puppet_debug to false
This isn't set unless the playbook is run via heat, so default it to false
to enable easier use via ansible-playbook combined with tripleo-ansible-inventory

Change-Id: I9705e4533831a019dd0051e5522d4b7958682506
2017-08-12 10:40:34 +00:00
Steven Hardy
76421eb249 Move deploy-steps-playbook to deploy-steps-tasks
So that we can more easily iterate over an include in an output

Change-Id: Idd5bb47589e5c37123caafcded1afbff8881aa33
2017-08-12 10:40:25 +00:00
Steven Hardy
7f6305980d Consolidate puppet/docker deployments with one deploy steps workflow
If we consolidate these we can focus on one implementation (the new ansible
based one used for docker-steps)

Change-Id: Iec0ad2278d62040bf03613fc9556b199c6a80546
Depends-On: Ifa2afa915e0fee368fb2506c02de75bf5efe82d5
2017-08-11 17:25:02 +00:00
Ben Nemec
4502b7cba6 Make RoleParameters and key_name descriptions consistent
The key_name default is ignored because the parameter is used in
some mutually exclusive environments where the default doesn't
need to be the same.

Change-Id: I77c1a1159fae38d03b0e59b80ae6bee491d734d7
Partial-Bug: 1700664
2017-08-02 16:18:25 -05:00
Steven Hardy
0a44085af6 Move docker_puppet_tasks calculation into services.yaml
This makes the RolesData output more accurate, and we can rework
things so docker-puppet only gets run when there is a non-empty
file calculated (i.e. there are tasks to run).

Change-Id: I8cdab3c857977c80fe2e359ab9e05740a838d66b
2017-07-24 14:02:44 +03:00
Steven Hardy
d364d9cca2 Move services.yaml output calculation into Value resources
This stores the result of the yaql queries etc for easier debugging, and
also so there's no risk we constantly re-evaluate the expensive query
which can happen with some heat versions and configurations.

This also gives a nicer error when things go wrong as when a query fails
you know which resource had an error, and also the validation on resources
is currently stricter due to bug #1599114.  We also get some additional
type validation from each OS::Heat::Value resource, e.g it checks if the
calculated value is a valid map or list.

The final advantage (and the original motivation for doing this) is that
we can easily filter null values for any outputs where this isn't already
done, which makes the config data written via openstack overcloud config
download cleaner.
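
What filtering null values means for the written config data can be
sketched in Python as below. The templates do this inside heat, so this
is only an illustration; the function name is hypothetical, and filtering
recursively through nested maps and lists is an assumption.

```python
def drop_nulls(value):
    """Return a copy of value with None entries removed.

    Walks nested dicts and lists; other values are returned unchanged.
    """
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value if v is not None]
    return value
```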

Change-Id: Ia6697cf2e47f3f7b727d620536e0873a985c98c4
2017-07-24 14:02:44 +03:00
Steven Hardy
2ff922b0dc Move step_config/docker_config calculation into services.yaml
Moving these means we get a more accurate output from the overcloud
RoleData output, which more closely reflects what is actually
deployed.

Change-Id: I154f36c1597cf4abe29ca0bfe15a54f507433fb1
2017-07-21 11:05:46 +01:00
Giulio Fidente
baf6eee501 Adds network/cidr mapping into a new service property
Makes it possible to resolve network subnets within a service
template; the data is transported into a new property ServiceData
wired into every service which hopefully is generic enough to
be extended in the future and transport more data.

Data can be consumed in service templates to set config values
which need to know the subnet on which a daemon operates (for
example the Ceph Public vs Cluster network).

Change-Id: I28e21c46f1ef609517175f7e7ee19e28d1c0cba2
2017-07-14 13:44:04 +02:00
Steven Hardy
a6d2704468 Move services.yaml to common directory
This new directory has now been added to the RDO packaging so we
can move things common to both puppet/container architecture here,
starting with the recently combined services.yaml

Change-Id: If2ce27188c4c15002b3ad830e8d6eb9504d2f3d2
2017-07-13 13:41:19 +01:00
Steven Hardy
316cc2572d Remove duplicate docker/puppet services.yaml
Moving to one common services.yaml not only reduces the duplication, but it
should improve performance for the docker/services.yaml case, because we were
creating two ResourceChains with $many services which we know can be really
slow (especially since we seem to be missing concurrent: true on one)

Change-Id: I76f188438bfc6449b152c2861d99738e6eb3c61b
2017-06-09 17:10:43 +01:00