Moved the DockerWorker class from the module file into its own file
in the module_utils directory for future extension.
Unit tests changed accordingly.
Signed-off-by: Ivan Halomi <ivan.halomi@tietoevry.com>
Co-authored-by: Martin Hiner <martin.hiner@tietoevry.com>
Change-Id: Ia2a471a9a2805e13b2c20dbf8a7297c23231aae3
This change bumps up max supported Ansible version
to 4.x (ansible-core 2.11.x) and minimum to 2.10.
Change-Id: I8b9212934dfab3831986e8db55671baee32f4bbd
This is required to support Debian Bullseye (11) - need to set
nova-libvirt to use 'host' CgroupnsMode.
Change-Id: I40213d4092fa325bcf37bb1fb4437ab125fe328b
The proposed approach allows for checking whether config
files are current, e.g. when the deployment was aborted after
config files were generated but before they were injected into the
containers, which leaves old config in the containers.
After this patch we can do:
kolla-ansible genconfig
kolla-ansible deploy-containers
and it does what we expect, rather than the second command
being a no-op.
We also lose the need to have notifies
and whens in config and handler sections respectively.
This is optimised in a separate patch.
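A minimal sketch of the idea, assuming the check compares a content
hash of each generated file against the copy in the container. The
function names and the path-to-bytes mappings are hypothetical and are
not the module's actual interface:

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def stale_config_files(generated: dict, in_container: dict) -> list:
    """List paths whose container copy is missing or differs.

    Both arguments map a path to file contents (bytes). How the
    contents are read out of the container is out of scope here.
    """
    stale = []
    for path, data in sorted(generated.items()):
        current = in_container.get(path)
        if current is None or digest(current) != digest(data):
            stale.append(path)
    return stale
```

An empty result means the container already holds the generated
config, so nothing needs to be restarted.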
Future work:
- optimise for large files
- could we get away with comparing timestamps and sizes?
  the container's copy should have a newer timestamp due to the copy;
  we could also preserve it
Change-Id: I1d26e48e1958f13b854d8afded4bfba5021a2dec
Closes-Bug: #1848775
Depends-On: https://review.opendev.org/c/openstack/kolla/+/773257
Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Keepalived and haproxy cooperate to provide control plane HA in
kolla-ansible deployments.
Certain care should be exerted to avoid prolonged availability
loss during reconfigurations and upgrades.
This patch aims to provide this care.
There is nothing special about keepalived upgrade compared to
reconfig, hence it is simplified to run the same code as for
deploy.
The broken safe-upgrade logic is replaced by common handler
code whose goal is to ensure we take down the current master only
after the backups are ready.
This change introduces a switch to the kolla_docker module,
ignore_missing, that allows missing containers to be ignored
(as they are logically stopped).
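A rough sketch of what such a switch does, assuming a plain
name-to-status mapping in place of the Docker API. The exception class
and function signature are illustrative only:

```python
class ContainerNotFound(Exception):
    """Raised when a named container does not exist."""


def stop_container(containers: dict, name: str,
                   ignore_missing: bool = False) -> bool:
    """Stop a container and return True if its state changed.

    `containers` is a hypothetical name -> status mapping standing in
    for the Docker API.
    """
    if name not in containers:
        if ignore_missing:
            # A missing container is logically already stopped.
            return False
        raise ContainerNotFound(name)
    if containers[name] == "exited":
        return False
    containers[name] = "exited"
    return True
```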
All tests are included.
Change-Id: I22ddec5f7ee4a7d3d502649a158a7e005fe29c48
W503 and W504 are incompatible and we need to choose one of them.
Existing code follows W503, so we disable W504.
Change-Id: Ic745e956dd332eb0fa49b93c1e6acb12f8a7f26c
The repo is Python 3 now, so update hacking to version 3.0 which
supports Python 3.
Fix problems found by updated hacking version.
Remove hacking and friends from lower-constraints, they are not needed
during installation.
Change-Id: I7ef5ac8a89e94f5da97780198619b6facc86ecfe
Kolla-Ansible Ceph deployment mechanism has been deprecated in Train [1].
This change removes the Ansible code and associated CI jobs.
[1]: https://review.opendev.org/669214
Change-Id: Ie2167f02ad2f525d3b0f553e2c047516acf55bc2
This is to fix the duplicated words issue like
"Other services that are are out of scope of this".
Change-Id: Ie4882dbb64d6e8774888b97895af20ba3855f0f8
Adds support for configuration of the Docker client timeout via
'docker_client_timeout'.
This change also increases the default timeout to 120 seconds, as we
sometimes see timeouts in CI and heavily loaded or underpowered
environments. Increasing 'docker_client_timeout' further may be helpful
in cases where Docker reports 'Read timed out'.
Change-Id: I73745771078cb2c0ebae2b1d87ba2c4c12958d82
Closes-Bug: #1809844
Steps to reproduce:
* Deploy services using kolla-ansible deploy
* Reconfigure the image for one or more services to use an invalid
  config
* Deploy/reconfigure services using kolla-ansible reconfigure
The invalid config could be a wrong docker registry, wrong image name,
wrong tag, etc.
Expected result: the restart handler for the service fails, and the
old container is left running.
Actual result: the restart handler for the service fails, and the old
container is stopped and removed. This leaves the service in a broken
state.
This change fixes the issue by pulling the image if necessary prior to
stopping and removing the container.
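The ordering can be sketched as follows. FakeDocker is a hypothetical
in-memory stand-in for a Docker client, used only to show that a failed
pull leaves the old container untouched:

```python
class FakeDocker:
    """Hypothetical stand-in for a Docker client, for illustration only."""

    def __init__(self, available_images, running):
        self.available_images = set(available_images)
        self.local_images = set()
        self.running = dict(running)  # container name -> image

    def pull(self, image):
        if image not in self.available_images:
            raise RuntimeError("pull failed: %s" % image)
        self.local_images.add(image)


def recreate_container(client, name, image):
    """Pull first, so a bad image fails before the old container is touched."""
    if image not in client.local_images:
        client.pull(image)  # may raise; the old container keeps running
    client.running.pop(name, None)  # stop and remove the old container
    client.running[name] = image  # create and start the new one
```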
Change-Id: I85b2a1b224d4c4d85c32c4922a2cd2c41171a1dc
Closes-Bug: #1852572
This role can be used by other roles to register RabbitMQ resources.
Currently support is provided for creating virtual hosts and users.
Change-Id: Ie1774a10b4d629508584af679b8aa9e372847804
Partially Implements: blueprint support-nova-cells
Depends-On: https://review.opendev.org/684742
Since 70b515bf12 was merged, we implicitly require Docker API version 1.25
(https://docs.docker.com/engine/api/v1.25/) to support passing
environment variables to docker exec. The version of docker we deployed
before the Docker CE upgrade was 1.12.0, which is Docker API version
1.24, and so does not support this. We get the following error:
Setting environment for exec is not supported in API < 1.25
This change modifies the kolla_toolbox module to use the new JSON
method for parsing Ansible's output when Docker API 1.25 is available,
falling back to the old regex-based method otherwise.
This change can be reverted when we require a minimum Docker API version
of 1.25+.
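A minimal sketch of the version gate, assuming the API version is
available as a dotted string. The function names and the "json" and
"regex" labels are illustrative, not the module's real identifiers:

```python
def api_version_at_least(version: str, minimum: str = "1.25") -> bool:
    """Compare dotted Docker API versions numerically, not as strings."""
    def as_tuple(value):
        return tuple(int(part) for part in value.split("."))
    return as_tuple(version) >= as_tuple(minimum)


def choose_parser(api_version: str) -> str:
    """Pick the output parsing method for the given API version."""
    return "json" if api_version_at_least(api_version) else "regex"
```

Comparing integer tuples avoids the string-comparison trap in which
"1.9" would sort after "1.25".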
Change-Id: Ie671624ecca5b43d7bd8fbd959d701d9e21d66b3
Closes-Bug: #1845681
The kolla_toolbox Ansible module executes ad-hoc ansible commands in the
kolla_toolbox container, and parses the output to make it look as if
ansible-playbook executed the command. Currently however, this module
sometimes fails to catch failures of the underlying command, and also
sometimes shows tasks as 'ok' when the underlying command reported a change.
This has been tested both before and after the upgrade to ansible 2.8.
This change fixes this issue by configuring ansible to emit output in
JSON format, to make parsing simpler. We can now pick up errors and
changes, and signal them to the caller.
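A sketch of such parsing, assuming the output follows the
plays/tasks/hosts layout of Ansible's json stdout callback. The
function name and the returned dict are illustrative and do not
reproduce the module's code:

```python
import json


def summarise(output: str) -> dict:
    """Collapse per-host task results into one changed/failed verdict."""
    data = json.loads(output)
    results = [
        result
        for play in data.get("plays", [])
        for task in play.get("tasks", [])
        for result in task.get("hosts", {}).values()
    ]
    return {
        "failed": any(r.get("failed", False) for r in results),
        "changed": any(r.get("changed", False) for r in results),
    }
```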
This change also adds an ansible playbook, tests/test-kolla-toolbox.yml,
that can be executed to test the module. It's not currently integrated
with any CI jobs.
Note that this change cannot be backported as the JSON output callback
plugin was added in Ansible 2.5.
Change-Id: I8236dd4165f760c819ca972b75cbebc62015fada
Closes-Bug: #1844114
In order to orchestrate a smooth transition to fluentd 0.14.x
(aka the 1.0 stable branch, aka td-agent 3)
from the td-agent repository, use image labels (fluentd_version
and fluentd_binary).
Depends-On: https://review.opendev.org/676411
Change-Id: Iab8518c34ef876056c6abcdb5f2e9fc9f1f7dbdd
- add support for sha256 in bslurp module
- change sha1 to sha256 in ceph-mon ansible role
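For illustration, a SHA-256 file digest can be computed with the
standard library as below. The helper name and chunk size are
assumptions and are not taken from the bslurp module:

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    checksum = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum.update(chunk)
    return checksum.hexdigest()
```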
Depends-On: https://review.opendev.org/655623
Change-Id: I25e28d150f2a8d4a7f87bb119d9fb1c46cfe926f
Closes-Bug: #1826327
1) ceph-nfs (ganesha-ceph) - use NFSv4 only
This is recommended upstream.
v3 and UDP require portmapper (aka rpcbind) which we
do not want, except where Ubuntu ganesha version (2.6)
forces it by requiring enabled UDP, see [1].
The issue has been fixed in 2.8, included in CentOS.
Additionally disable v3 helper protocols and kerberos
to avoid meaningless warnings.
2) ceph-nfs (ganesha-ceph) - do not export host dbus
It is not in use. This avoids the temptation to try
handling it on host.
3) Properly handle ceph services deploy and upgrade
Upgrade runs deploy.
The order has been corrected - nfs goes after mds.
Additionally upgrade takes care of rgw for keystone
(for swift emulation).
4) Enhance ceph keyring module with error detection
Now it does not blindly try to create a keyring after
any failure. This used to hide the real issue.
5) Retry ceph admin keyring update until cluster works
Reordering the deployment caused an issue with the ceph cluster not
being fully operational before taking actions on it.
6) CI: Remove osd df from collected logs as it may hang CI
Hangs are caused by healthy MON and no healthy MGR.
A descriptive note is left in its place.
7) CI: Add 5s timeout to ceph informational commands
This decreases the timeout from the default 300s.
[1] https://review.opendev.org/669315
Change-Id: I1cf0ad10b80552f503898e723f0c4bd00a38f143
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.
This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.
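A minimal sketch of such validation. The four policy names are the
ones Docker accepts; the function name is illustrative:

```python
VALID_RESTART_POLICIES = ("no", "on-failure", "always", "unless-stopped")


def check_restart_policy(policy: str) -> str:
    """Return the policy unchanged, or raise if Docker would not accept it."""
    if policy not in VALID_RESTART_POLICIES:
        raise ValueError(
            "Invalid restart policy %r, expected one of: %s"
            % (policy, ", ".join(VALID_RESTART_POLICIES))
        )
    return policy
```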
I added some FIXMEs around which are relevant to kolla-ansible docker
integration. They are not fixed in here to not alter behavior.
[1] https://review.opendev.org/667363
Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
It is possible to reference an undefined variable in the kolla_docker
module if DockerWorker object initialization fails, so the current
behaviour crashes the playbook with this unwanted error message:
UnboundLocalError: local variable 'dw' referenced before assignment
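One way to avoid this class of error, sketched with a hypothetical
stand-in for DockerWorker, is to bind the name before the try block.
This is illustrative and not necessarily the exact fix applied here:

```python
class DockerWorker:
    """Hypothetical stand-in whose constructor can fail."""

    def __init__(self, fail=False):
        if fail:
            raise RuntimeError("cannot connect to Docker")
        self.changed = True


def run(fail=False):
    dw = None  # bind the name before the try block
    try:
        dw = DockerWorker(fail=fail)
        return {"failed": False, "changed": dw.changed}
    except Exception as exc:
        # Without the initialisation above, reading `dw` here would raise
        # UnboundLocalError whenever the constructor itself failed.
        changed = dw.changed if dw is not None else False
        return {"failed": True, "changed": changed, "msg": str(exc)}
```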
Change-Id: Ic8d26b11f93255220888b5406f8ab4a6f81736c2
Closes-Bug: #1819361
By default, docker containers inherit their ulimits from the docker
daemon. On CentOS 7, the docker daemon's default NOFILE is 1048576.
It can be found in /usr/lib/systemd/system/docker.service.
Such a large limit causes many problems, so we should be able to
control it in production environments.
Change-Id: Iab962446a94ef092977728259d9818b86cfa7f68
Nova services may reasonably expect cell databases to exist when they
start. The current cell setup tasks in kolla run after the nova
containers have started, meaning that cells may or may not exist in the
database when they start, depending on timing. In particular, we are
seeing issues in kolla CI currently with jobs timing out waiting for
nova compute services to start. The following error is seen in the nova
logs of these jobs, which may or may not be relevant:
No cells are configured, unable to continue
This change creates the cell0 and cell1 databases prior to starting nova
services.
In order to do this, we must create new containers in which to run the
nova-manage commands, because the nova-api container may not yet exist.
This required adding support to the kolla_docker module for specifying a
command for the container to run that overrides the image's command.
We also add the standard output and error to the module's result when a
non-detached container is run. A secondary benefit of this is that the
output of bootstrap containers is now displayed in the Ansible output if
the bootstrapping command fails, which will help with debugging.
Change-Id: I2c1e991064f9f588f398ccbabda94f69dc285e61
Closes-Bug: #1808575
This change adds support for configuring tty.
It was enabled by default, but a recent patch
removed it. Some services, such as Karaf in opendaylight,
require a TTY during startup.
Closes-Bug: #1806662
Change-Id: Ia4335523b727d0e45505cbb1efb40ccf04c27db7
With a pseudo terminal, service is not treated as a daemon
and signals would not work as expected.
Change-Id: I16aa29a7924df51659d973a81d8005ae3d86f57b
Related-Bug: #1799642
Move the DOCUMENTATION and EXAMPLES of the merge_yaml and merge_config
modules into action_plugins.
Change-Id: I84c5b94afb870fc9a25838782389f7b1f8b882fd
Closes-Bug: #1799236
This commit applies resource constraints to only a few OpenStack services.
Constraints for other services will follow in later commits.
Partially-Implements: blueprint resource-constraints
Change-Id: Icafa54baca24d2de64238222a5677b9d8b90e2aa
This commit constrains the dimensions of the `Nova` service
and the sub-containers deployed along with it.
A user can set the dimension values in `/etc/kolla/globals.yml`,
using the data types stated in this commit.
Reference-Docs:
https://docs.docker.com/config/containers/resource_constraints/
Added Test-cases for the same.
Partially-Implements: blueprint resource-constraints
Change-Id: I6458d8fb7b26a6e7c3a9fd0d674d9cf129b0bf5d
This patch increases the default timeout for
the kolla_toolbox ansible module when talking
with the docker API from the default 60 to 180 seconds.
This is required on slower deployments,
especially when bootstrapping an environment where fernet
tokens are in use. For faster deployments this is
harmless, but for slower deployments it is
beneficial.
Bug: #1767136
Change-Id: I0391715b16cf86d6c27fecf8a666de64f2735a7d
Signed-off-by: Jorge Niedbalski <jorge.niedbalski@linaro.org>
In older docker versions, if ipc_mode is not specified, the default
value is empty, but in recent docker, such as 17.09.0, the default
is "IpcMode": "shareable", which causes all containers to be deleted
and re-created on redeploy or upgrade. This commit solves the
problem.
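A sketch of a comparison that tolerates the changed default, assuming
the check is skipped when no ipc_mode was requested. The function name
is illustrative, and container_info mimics the shape of the docker
inspect output:

```python
def ipc_mode_differs(requested, container_info: dict) -> bool:
    """Report a difference only when a specific IPC mode was requested."""
    # Normalise an empty or absent value to None.
    current = container_info.get("HostConfig", {}).get("IpcMode") or None
    # An unspecified ipc_mode must not trigger a recreate, whatever
    # default the docker daemon filled in.
    return requested is not None and requested != current
```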
Change-Id: Ia8269b9c8066880e4aee23d6fdea8d9c04c41e44
Closes-Bug: #1747586
When upgrading from ceph Jewel to ceph Luminous, the client.admin caps
should include `mgr 'allow *'`.
Change-Id: Ia4cb7a59d4cf215a1dce1efe31e00f1401e0b753
Closes-Bug: #1750967
recreate_or_restart_container is missing a container status check.
Because of this, a container that is not running (e.g. after
kolla-ansible stop) is not started by deploy/reconfigure/upgrade
if any other param changes.
Change-Id: I5cff5f367e963ba8b1807ec46469da817e40e468
Closes-Bug: #1714015
In some cases, docker cannot remove a container and raises the following
error message:
Unable to remove filesystem for xxx remove
/var/lib/docker/containers/xxx/shm: device or resource busy
The container is nevertheless removed. This patch assumes the container
is removed as long as its name is not shown in docker ps.
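The behaviour can be sketched with a hypothetical in-memory client.
FakeDocker, RemoveError and the method names are assumptions made for
illustration:

```python
class RemoveError(Exception):
    """Hypothetical error raised when removal reports a failure."""


class FakeDocker:
    """Stand-in client whose remove() always reports an error."""

    def __init__(self, names, really_removes=True):
        self.names = set(names)
        self.really_removes = really_removes

    def list_names(self):
        return set(self.names)

    def remove(self, name):
        if self.really_removes:
            self.names.discard(name)
        raise RemoveError("shm: device or resource busy")


def remove_container(client, name):
    """Ignore a removal error if the container is no longer listed."""
    try:
        client.remove(name)
    except RemoveError:
        if name in client.list_names():
            raise  # still present, so the failure is real
```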
Closes-Bug: #1662598
Change-Id: I079d5ec6178018403ec7a49c975f137e27eb9ad4
Ansible checks whether module parameters have names like
%password% and warns when such a parameter is not hidden from the log.
Hiding it requires the "no_log" parameter option.
This patch just adds "no_log" in order to avoid
this warning.
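A sketch of an argument spec with the option set. The parameter names
and the helper are illustrative; no_log itself is a standard Ansible
argument spec option:

```python
# Illustrative argument spec; only the password-like entry is hidden.
argument_spec = dict(
    auth_username=dict(required=False, type="str"),
    auth_password=dict(required=False, type="str", no_log=True),
)


def unprotected_secrets(spec: dict) -> list:
    """List password-like parameters that are not marked no_log."""
    return sorted(
        name
        for name, options in spec.items()
        if "password" in name and not options.get("no_log", False)
    )
```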
Change-Id: I9c1df1093e0fd101090292d6e8bf3527f99aeb17
Closes-Bug: #1702244
The pypi package 'docker-py' [1] has been renamed to 'docker' [2].
It is better to move to the new 'docker' package because the old
package will be deprecated and all the new features will go into
the new package only.
Package 'docker' has been added to requirements [3]. The old
package 'docker-py' is still allowed to be in the global requirements
during the transition period but it should be removed after all or
most of the projects finish the migration.
[1] https://pypi.python.org/pypi/docker-py
[2] https://pypi.python.org/pypi/docker
[3] https://review.openstack.org/#/c/423715/
Change-Id: Ibcd5a57a1fbf55dcc5a690e41f20917f95b63da0