* Updates etcd to v3.4
* Updates the config to use v3.4's logging mechanism
* Deprecated etcd CA parameters aren't used, so we are not affected
by their removal.
* Note that we are not currently guarding against skip-version updates for
etcd.
Notable non-voting jobs exercising some of this:
* kolla-ansible-ubuntu-upgrade-cephadm (cinder->tooz->etcd3gw->etcd)
* kolla-ansible-ubuntu-zun (see
https://review.opendev.org/c/openstack/openstack-ansible/+/883194 )
Depends-On: https://review.opendev.org/c/openstack/kolla/+/890464
Change-Id: I086e7bbc7db64421445731a533265e7056fbdb43
* etcd service containers usually have a set of
environment parameters required to boot the container.
* The short-lived etcd bootstrap containers pass extra
ETCD_INITIAL_* environment variables, but still need to
pass the ones that the service containers use.
* This uses ansible's `combine` filter to cut down on the
duplication.
* This is intended to be just a straightforward refactor.
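As a rough illustration of the idea (not the role's actual code), the
merge that `combine` performs on two mappings behaves much like the
Python dictionary merge below. The helper name and all values are
hypothetical; only the ETCD_* variable names are real etcd settings.

```python
# Hypothetical environment shared by the etcd service containers.
service_env = {
    "ETCD_NAME": "node1",
    "ETCD_DATA_DIR": "/var/lib/etcd",
}

# Hypothetical extra variables that only the bootstrap container needs.
bootstrap_env = {
    "ETCD_INITIAL_CLUSTER_STATE": "new",
    "ETCD_INITIAL_CLUSTER_TOKEN": "example-token",
}


def combine(base, extra):
    """Return a new dict with keys from both; `extra` wins on conflict."""
    merged = dict(base)
    merged.update(extra)
    return merged


# The bootstrap container reuses the service environment and only adds
# its own variables, so the shared part is written once.
bootstrap_container_env = combine(service_env, bootstrap_env)
```

The inputs are left unmodified, so the service containers keep using
`service_env` as-is.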
Change-Id: I04e95f92a8f365553afd618d58b99de595d48312
This commit addresses a few shortcomings in the etcd service:
* Adding or removing etcd nodes required manual intervention.
* The etcd service would have brief outages during upgrades or
reconfigures because restarts weren't always serialised.
This makes the etcd service follow a similar pattern to mariadb:
* There is now a distinction between bootstrapping the cluster
and adding / removing another member.
* This more closely follows etcd's upstream bootstrapping
guidelines.
* The etcd role now serialises restarts internally so the
kolla_serial pattern is no longer appropriate (or necessary).
This does not remove the need for manual intervention in all
failure modes: the documentation has been updated to address the
most common issues.
Note that there's repetition in the container specifications: this
is somewhat deliberate. In a future cleanup, it's intended to reduce
the duplication.
Change-Id: I39829ba0c5894f8e549f9b83b416e6db4fafd96f
Changes the name of the ansible module kolla_docker to
kolla_container.
Change-Id: I13c676ed0378aa721a21a1300f6054658ad12bc7
Signed-off-by: Martin Hiner <m.hiner@partner.samsung.com>
etcd-compatible tooz drivers do not support multiple endpoints via
backend_url. We can put a loadbalancer in front of etcd and configure
backend_url to use the VIP instead. The issue with hard coding the first
host is that we break coordination if we take this host offline. In the
case of cinder, we would not be able to perform any volume related
operations.
Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Change-Id: Ib684501ba03c386dc5ac71e5cbea05c99f191665
When running in check mode, some prechecks previously failed because
they use the command module which is silently not run in check mode.
Other prechecks were not running correctly in check mode due to e.g.
looking for a string in empty command output or not querying which
containers are running.
This change fixes these issues.
Closes-Bug: #2002657
Change-Id: I5219cb42c48d5444943a2d48106dc338aa08fa7c
Regularly, we experience issues in Kolla Ansible deployments because we
use wrong options in OpenStack configuration files. This is because
OpenStack services ignore unknown options. We also need to keep on top
of deprecated options that may be removed in the future. Integrating
oslo-config-validator into Kolla Ansible will greatly help.
Adds a shared role to run oslo-config-validator on each service. Takes
into account that services have multiple containers, and these may also
use multiple config files. Service roles are extended to use this shared
role. Executed with the new command ``kolla-ansible validate-config``.
Change-Id: Ic10b410fc115646d96d2ce39d9618e7c46cb3fbc
Second part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which it was suggested to split the patch into smaller ones.
This change adds a container_engine variable to the
kolla_container_facts module. This prepares the module to be used with
both docker and podman without further changes in roles.
Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I9e8fa30646844ab4a288555f3aafdda345b3a118
We get a nice optimisation by using a filtered loop instead
of task skipping per service with 'when'.
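A loose Python analogy of the difference, assuming a hypothetical
services mapping and helper name (this is not the actual filter used by
the roles): filtering first means the loop body only ever sees the
services that apply, instead of visiting every service and skipping
most of them.

```python
# Hypothetical service definitions; the keys and fields are illustrative.
services = {
    "etcd": {"enabled": True, "group": "etcd"},
    "example-disabled": {"enabled": False, "group": "etcd"},
    "example-other-group": {"enabled": True, "group": "compute"},
}

# Hypothetical set of groups that the current host belongs to.
host_groups = {"etcd", "control"}


def select_services(services, host_groups):
    """Keep only enabled services mapped to one of the host's groups."""
    return {
        name: svc
        for name, svc in services.items()
        if svc["enabled"] and svc["group"] in host_groups
    }


# Loop over the pre-filtered mapping: no per-item skip is needed.
selected = select_services(services, host_groups)
processed = [name for name in selected]
```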
Partially-Implements: blueprint performance-improvements
Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
This reverts commit 9cae59be51e8d2d798830042a5fd448a4aa5e7dc.
Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues.
Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc
Closes-Bug: #1906288
Main plays are action-redirect-stubs, ideal for import_tasks.
This avoids 'include' penalty and makes logs/ara look nicer.
Fixes haproxy and rabbitmq not to check the host group as well.
Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0
Partially-Implements: blueprint performance-improvements
Config plays do not need to check containers. This avoids skipping
tasks during the genconfig action.
Ironic and Glance rolling upgrades are handled specially.
Swift and Bifrost do not use the handlers at all.
Partially-Implements: blueprint performance-improvements
Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. For unconditionally included tasks, switching to
import_tasks provides a clear benefit.
Benchmarking of include vs. import is available at [1].
This change switches from include_tasks to import_tasks where there is
no condition applied to the include.
[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md#task-include-and-import
Partially-Implements: blueprint performance-improvements
Change-Id: Ia45af4a198e422773d9f009c7f7b2e32ce9e3b97
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. In the case of the check-containers.yml include, the
included file only has a single task, so the overhead of skipping this
task will not be greater than the overhead of the task import. It
therefore makes sense to switch to use import_tasks there.
Partially-Implements: blueprint performance-improvements
Change-Id: I65d911670649960708b9f6a4c110d1a7df1ad8f7
Both include_role and import_role expect the role's name to be given
via the "name" param instead of "role".
Using "role" worked but caused errors with ansible-lint.
See: https://review.opendev.org/694779
Change-Id: I388d4ae27111e430d38df1abcb6c6127d90a06e0
We assume that all groups are present in the inventory, and quite obtuse
errors can result if any are not.
This change adds a precheck that checks for the presence of all expected
groups in the inventory for each service. It also introduces a common
service-precheck role that we can use for other common prechecks.
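The check itself amounts to a set difference. A minimal sketch in
Python, assuming hypothetical group names and a hypothetical helper
(the real precheck is implemented as Ansible tasks):

```python
def missing_groups(expected_groups, inventory_groups):
    """Return the expected groups that are absent from the inventory."""
    return sorted(set(expected_groups) - set(inventory_groups))


# Hypothetical example: the service expects an "etcd" group that the
# inventory does not define.
expected = ["etcd", "control"]
inventory = ["control", "compute"]

missing = missing_groups(expected, inventory)
if missing:
    message = "Missing inventory groups: " + ", ".join(missing)
else:
    message = "All expected groups are present"
```

Failing early with a message of this kind is what replaces the obtuse
errors that would otherwise surface later in the run.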
Change-Id: Ia0af1e7df4fff7f07cd6530e5b017db8fba530b3
Partially-Implements: blueprint improve-prechecks
Sometimes, as cloud admins, we want to update only the code that is
running in a cloud, without doing anything else. Add an action to
kolla-ansible that allows us to do that.
Change-Id: I904f595c69f7276e71692696471e32fd1f88e6e8
Implements: blueprint deploy-containers-action
Currently, we have a lot of logic for checking if a handler should run,
depending on whether config files have changed and whether the
container configuration has changed. As rm_work pointed out during
the recent haproxy refactor, these conditionals are typically
unnecessary - we can rely on Ansible's handler notification system
to only trigger handlers when they need to run. This removes a lot
of error prone code.
This patch removes conditional handler logic for all services. It is
important to ensure that we no longer trigger handlers when unnecessary,
because without these checks in place it will trigger a restart of the
containers.
Implements: blueprint simplify-handlers
Change-Id: I4f1aa03e9a9faaf8aecd556dfeafdb834042e4cd
Many tasks that use Docker already specify become, but
not all. This change ensures that all tasks using the following
modules specify become:
* kolla_docker
* kolla_ceph_keyring
* kolla_toolbox
* kolla_container_facts
It also adds become for 'command' tasks that use docker CLI.
Change-Id: I4a5ebcedaccb9261dbc958ec67e8077d7980e496
With this change, an operator can stop a service container
without stopping all services on a host.
This change is the starting point for fast-forward
upgrade support.
In subsequent changes, new flags will be introduced to disable
stopping of dataplane services during upgrades.
Change-Id: Ifde7a39d7d8596ef0d7405ecf1ac1d49a459d9ef
Implements: blueprint support-stop-containers
Since I701d495675178c3ed8ec1f00b31d09f198b38a6f merged, etcd only runs
on the control hosts, not the compute hosts. We therefore no longer
require the etcd group to include the compute hosts.
Since the group mapping is now static, we can remove the use of
host_in_groups from the etcd service, in favour of the simpler method of
specifying the group.
Change-Id: Id8f888d7321a30a85ff95e742f7e6c8e2b9c696f
Related-Bug: #1790415
This commit applies resource constraints to only a few OpenStack services.
Constraints for other services will be applied in subsequent commits.
Partially-Implements: blueprint resource-constraints
Change-Id: Icafa54baca24d2de64238222a5677b9d8b90e2aa
Add become to all tasks that use the module "kolla_docker"
Change-Id: I4309c4011687b88ec31d739fd8f834fe2326ff10
Partial-Implements: blueprint ansible-specific-task-become
- rename action and serial to kolla_action and kolla_serial
- use become instead of "sudo <command>" in shell
- Remove quotes for failed_when and changed_when in rabbitmq tasks
Change-Id: I78cb60168aaa40bb6439198283546b7faf33917c
Implements: blueprint migrate-to-ansible-2-2-0
In ansible/roles/etcd/tasks/config.yml, the kolla_docker
compare_container action doesn't check the environment.
Once a container is created, it won't get recreated if only the
environment changes. This commit adds the environment attribute to the
kolla_docker action in the etcd role.
Change-Id: I8fb71cc945867e06acc67f6d1256bf62f4276206
Closes-Bug: #1765517
Kuryr needs etcd on each compute node to store
network data.
Etcd is currently deployed only on controller nodes.
This change also removes useless bootstrap tasks.
Depends-On: I9c6c876773288c2f951966498db0ff8af090ac20
Change-Id: I8a84334e831fb15f6cbdd3bc34d2159638df6b85
Closes-Bug: #1697699
The wait_for module waits up to 300 seconds for the port to be started
or stopped. Waiting that long is pointless in a precheck. This patch
changes the timeout to 1 second.
Change-Id: I9b251ec4ba17ce446655917e8ef5e152ef947298
Closes-Bug: #1688152
Add a new subcommand 'check' to kolla-ansible, used to run the
smoke/sanity checks.
Add stub files to all services that don't currently have checks.
Change-Id: I9f661c5fc51fd5b9b266f23f6c524884613dee48
Partially-implements: blueprint sanity-check-container
During configuration file generation, the local variable api_interface
is used when referring to remote hosts. Instead, we should use
hostvars to find the correct interface name for each host
mentioned in the configuration.
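The difference between the two lookups can be sketched in Python. The
host names, interface names and helper functions below are
hypothetical, and the dictionary merely stands in for Ansible's
hostvars:

```python
# Hypothetical per-host variables, standing in for Ansible's hostvars.
hostvars = {
    "controller1": {"api_interface": "eth0"},
    "controller2": {"api_interface": "ens3"},
}


def interfaces_local(local_host, hosts):
    """Buggy pattern: reuse the local host's interface for every host."""
    local_interface = hostvars[local_host]["api_interface"]
    return {host: local_interface for host in hosts}


def interfaces_per_host(hosts):
    """Fixed pattern: look up each host's own interface name."""
    return {host: hostvars[host]["api_interface"] for host in hosts}


hosts = ["controller1", "controller2"]
wrong = interfaces_local("controller1", hosts)
right = interfaces_per_host(hosts)
```

The two results only differ when hosts use different interface names,
which is why the problem is easy to miss on homogeneous hardware.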
Change-Id: I9f64fdf2cd18bcc0bbf1c4193349186d9a7658bc
Closes-Bug: #1650195
Add /etc/localtime:/etc/localtime:ro to the volumes for aodh, barbican,
etcd, gnocchi, kuryr and sahara.
All of these containers were added in the Newton cycle, so there is no
need to backport.
Closes-Bug: #1633049
Change-Id: I9cdba54cf730af44fb1a9ff6f2c936d23dadbe9a
do_reconfigure.yml was introduced in order to use the serial directive,
but we used it incorrectly. Now that serial has moved to the playbook
file, it is time to remove the do_reconfigure.yml file.
Closes-Bug: #1628152
Change-Id: I8d42d27e6bc302a0e575b0353956eaef9b2ca9fd