This allows the install type for the project to be different from
kolla_install_type.
This can be used to avoid hitting bug 1786238, since kuryr only supports
the source type.
Change-Id: I2b6fc85bac092b1614bccfd22bee48442c55dda4
Closes-Bug: #1786238
The MariaDB role HAProxy config section exposes MariaDB on the
mariadb_port which may not always be the same as database_port. The
HAProxy role checks that the database_port is free, and not the
mariadb_port. This could mean that the check passes, but the actual
port which HAProxy will attempt to use is taken.
This change configures HAProxy to talk to the MariaDB instances on
the mariadb_port, and maps them to the database_port which is used by
most services as part of the DB connection string.
There is a small risk that it may break someone's override config.
Change-Id: I9507ee709cb21eb743112107770ed3170c61ef74
Explicitly wait for the database to be accessible via the load balancer.
Sometimes it can reject connections even when all database services are up,
possibly due to the health check polling in HAProxy.
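The explicit wait can be pictured as a bounded retry loop. The following
is only a minimal sketch in Python, not the task added by this change:
the function name, retry count and delay are assumptions, and a plain TCP
connect is a weaker check than an actual database login.

```python
import socket
import time


def wait_for_port(host, port, retries=10, delay=1.0, timeout=1.0):
    """Return True once a TCP connection to (host, port) succeeds.

    Hypothetical helper: a real check would more likely attempt a
    database login through the load balancer, not just a TCP connect.
    """
    for attempt in range(retries):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Connection refused or timed out; retry unless out of attempts.
            if attempt < retries - 1:
                time.sleep(delay)
    return False
```

Bounding the retries keeps a deployment from hanging forever if the
load balancer never starts accepting connections.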
Closes-Bug: #1840145
Change-Id: I7601bb710097a78f6b29bc4018c71f2c6283eef2
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.
This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.
I added some FIXMEs that are relevant to the kolla-ansible docker
integration. They are not fixed here so as not to alter behavior.
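As an illustration of "all valid restart policies and only valid restart
policies", a validation step could look roughly like the sketch below.
The function and constant names are hypothetical and do not come from
this patch; the four policy names are the ones Docker documents.

```python
# Restart policies accepted by Docker. Note there is no 'never'.
VALID_RESTART_POLICIES = ("no", "on-failure", "always", "unless-stopped")


def check_restart_policy(policy):
    """Return the policy unchanged, or raise ValueError if Docker
    would not recognise it (hypothetical helper)."""
    if policy not in VALID_RESTART_POLICIES:
        raise ValueError(
            "invalid restart policy %r, expected one of: %s"
            % (policy, ", ".join(VALID_RESTART_POLICIES))
        )
    return policy
```

Rejecting unknown names early turns a typo such as 'never' into an
immediate error rather than a silently ignored setting.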
[1] https://review.opendev.org/667363
Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
* Fix wsrep sequence number detection. Log message format is
'WSREP: Recovered position: <UUID>:<seqno>' but we were picking out
the UUID rather than the sequence number. This is as good as random.
* Add become: true to log file reading and removal since
I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
'docker cp' command which creates it.
* Don't run handlers during recovery. If the config files change we
would end up restarting the cluster twice.
* Wait for wsrep recovery container completion (don't detach). This
avoids a potential race between wsrep recovery and the subsequent
'stop_container'.
* Finally, we now wait for the bootstrap host to report that it is in
an OPERATIONAL state. Without this we can see errors where the
MariaDB cluster is not ready when used by other services.
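To make the first point concrete, here is a small Python sketch of
picking the sequence number (not the UUID) out of the recovery log. It
is not the expression used in the playbook; the regex and the function
name are assumptions based only on the message format quoted above.

```python
import re

# 'WSREP: Recovered position: <UUID>:<seqno>'; seqno may be -1.
RECOVERED_RE = re.compile(
    r"WSREP: Recovered position: ([0-9a-fA-F-]+):(-?\d+)"
)


def recovered_seqno(log_text):
    """Return the last recovered seqno in log_text as an int,
    or None if no 'Recovered position' line is present."""
    matches = RECOVERED_RE.findall(log_text)
    if not matches:
        return None
    _uuid, seqno = matches[-1]  # the second field is the seqno
    return int(seqno)
```

Returning None explicitly when nothing matches lets the caller fail with
a clear message instead of indexing into an empty result.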
Change-Id: Iaf7862be1affab390f811fc485fd0eb6879fd583
Closes-Bug: #1834467
Patch [1] did not add extra volumes support for all services.
To unify volume management, we need to add extra volumes support for
the remaining services.
[1] 12ff28a693
Change-Id: Ie148accdd8e6c60df6b521d55bda12b850c0d255
Partially-Implements: blueprint support-extra-volumes
Signed-off-by: ZijianGuo <guozijn@gmail.com>
Many tasks that use Docker have become specified already, but
not all. This change ensures all tasks that use the following
modules have become:
* kolla_docker
* kolla_ceph_keyring
* kolla_toolbox
* kolla_container_facts
It also adds become for 'command' tasks that use docker CLI.
Change-Id: I4a5ebcedaccb9261dbc958ec67e8077d7980e496
Since Ansible 2.5, the use of jinja tests as filters has been
deprecated.
I've run the script provided by the ansible team to 'fix' the
jinja filters to conform to the newer syntax.
This fixes the deprecation warnings.
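For readers unfamiliar with the syntax change, the rewrite is roughly
from 'result | failed' to 'result is failed'. The Python sketch below
is a deliberately simplified illustration and is not the script from
the Ansible team; it only handles the four test names listed, which
are chosen here as an assumption.

```python
import re

# Small, assumed subset of Ansible tests that were often used as filters.
TESTS = ("failed", "succeeded", "changed", "skipped")
PATTERN = re.compile(r"\s*\|\s*(%s)\b" % "|".join(TESTS))


def filter_to_test(expr):
    """Rewrite 'x | failed' style expressions to 'x is failed'."""
    return PATTERN.sub(r" is \1", expr)


print(filter_to_test("result | failed"))      # result is failed
print(filter_to_test("name | default('x')"))  # unchanged, a real filter
```

Genuine filters such as default() are left alone because they are not
in the list of test names.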
Change-Id: I844ecb7bec94e561afb09580f58b1bf83a6d00bd
Closes-bug: #1827370
Since we are now in the Train cycle, we can be sure that any running
MariaDB containers can be safely stopped, and we do not need to perform
an explicit shutdown prior to restarting them.
Change-Id: I5450690f1cbe0c995e8e4b01a76e90dac2574d61
Related-Bug: #1820325
Several config file permissions are incorrect on the host. In general,
files should be 0660, and directories and executables 0770.
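The rule of thumb above can be sketched in Python as follows. This is
only an illustration of the intended modes, not how the roles set them
(they do so through Ansible tasks); the function name is hypothetical
and an "executable" is assumed to be any file that already has an
execute bit set.

```python
import os
import stat


def fix_permissions(root):
    """Set directories and executables to 0770 and other files to 0660."""
    for dirpath, _dirnames, filenames in os.walk(root):
        os.chmod(dirpath, 0o770)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not change the target of a symlink
            mode = os.stat(path).st_mode
            is_executable = bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
            os.chmod(path, 0o770 if is_executable else 0o660)
```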
Change-Id: Id276ac1864f280554e98b937f2845bb424d521de
Closes-Bug: #1821579
Upgrading MariaDB from Rocky to Stein currently fails, with the new
container left continually restarting. The problem is that the Rocky
container does not shut down cleanly, leaving behind state that the new
container cannot recover. The container does not shut down cleanly
because we run dumb-init with a --single-child argument, causing it to
forward signals to only the process executed by dumb-init. In our case
this is mysqld_safe, which ignores various signals, including SIGTERM.
After a (default 10 second) timeout, Docker then kills the container.
A Kolla change [1] removes the --single-child argument from dumb-init
for the MariaDB container, however we still need to support upgrading
from Rocky images that don't have this change. To do that, we add new
handlers to execute 'mysqladmin shutdown' to cleanly shut down the
service.
A second issue with the current upgrade approach is that we don't
execute mysql_upgrade after starting the new service. This can leave the
database state using the format of the previous release. This patch also
adds handlers to execute mysql_upgrade.
[1] https://review.openstack.org/644244
Depends-On: https://review.openstack.org/644244
Depends-On: https://review.openstack.org/645990
Change-Id: I08a655a359ff9cfa79043f2166dca59199c7d67f
Closes-Bug: #1820325
These issues show up intermittently in various branches. In all
cases the wrong path is used for the resolveip binary, similar to
the recent kolla-ansible-ubuntu-source job failures.
Change-Id: I8cce42b60897e4ceb8d3b0bd5181fda88b10c2b8
- py35/py36 jobs are failing
Python 3.6 pycache also includes links, so those also
need to be removed by the tox testenv.
- kolla-ansible-ubuntu-source job is failing
Without basedir set in galera.cnf, mysql_install_db looks for resolveip
in /usr/sbin instead of /usr/bin, and complains that it can resolve
neither $HOSTNAME nor localhost.
Change-Id: I40514c0a7c43ae01c7680aac81123942be1cdef9
xtrabackup doesn't work with MariaDB 10.3 and needs to be replaced
with the mariadb-backup tool.
For now, only migrate galera, not the kolla-backup tool,
to fix the CI.
https://jira.mariadb.org/browse/MDEV-15774
Change-Id: Ie77ae41e419873feed4b036a307887b22455183b
Depends-On: Icefe3a77fb12d57c869521000d458e3f58435374
With this change, an operator can stop a
service container without stopping all services on a host.
This change is the starting point for
fast-forward upgrade support.
In later changes, new flags will be introduced to disable
stopping dataplane services during upgrades.
Change-Id: Ifde7a39d7d8596ef0d7405ecf1ac1d49a459d9ef
Implements: blueprint support-stop-containers
blueprint database-backup-recovery
Introduce a new option, mariadb_backup, which takes a backup of all
databases hosted in MariaDB.
Backups are performed using XtraBackup, the output of which is saved to
a dedicated Docker volume on the target host (which defaults to the
first node in the MariaDB cluster).
It supports either full (the default) or incremental backups.
Change-Id: Ied224c0d19b8734aa72092aaddd530155999dbc3
Having all services in one giant haproxy file makes altering
configuration for a service both painful and dangerous. Each service
should be configured with a simple set of variables and rendered with a
single unified template.
Available are two new templates:
* haproxy_single_service_listen.cfg.j2: close to the original style, but
only one service per file
* haproxy_single_service_split.cfg.j2: using the newer haproxy syntax
for separated frontend and backend
For now the default will be the single listen block, for ease of
transition.
Change-Id: I6e237438fbc0aa3c89a3c8bd706a53b74e71904b
With more recent versions of Ansible, tests should be written with
"is" instead of "|".
This change updates them.
Change-Id: I6fba56fca182349972e8b0ee5452b37aa4090e0c
This commit applies resource constraints to a few more OpenStack services.
Constraints for the last set of services will be applied in
an upcoming commit.
Depends-on: Icafa54baca24d2de64238222a5677b9d8b90e2aa
Change-Id: I39004f54281f97d53dfa4b1dbcf248650ad6f186
As reported in the bug, these can grow to 10s to 100s of GB
in a month. To reduce the chance of filling the disk and
bringing down the control plane this change defines
an expiry time.
Closes-Bug: 1720113
Change-Id: I508aad1f515d5108a3d08c90318b70d0a918908c
Add become to all tasks that use the module "kolla_docker"
Change-Id: I4309c4011687b88ec31d739fd8f834fe2326ff10
Partial-Implements: blueprint ansible-specific-task-become
Use the mariadb service defined in defaults when booting bootstrap_mariadb.
This is not a bug fix, just an enhancement.
Change-Id: I1f8b51fb6177a8524483e600701924dbfc3403cb
- rename action and serial to kolla_ansible and kolla_serial
- use become instead of "sudo <command>" in shell
- Remove quotes from failed_when and changed_when in rabbitmq tasks
Change-Id: I78cb60168aaa40bb6439198283546b7faf33917c
Implements: blueprint migrate-to-ansible-2-2-0
The regex used to find the recovered seqnum partition does not
return the real number but None instead.
The task then fails because seqnum[0] is not iterable.
Change-Id: I1be55b6ebfc17c6d423e638662ec2a9f4b9b49a2
Closes-Bug: #1752128
This patchset applies a yamllint test to all *.yml
files.
It also fixes syntax errors so that the jobs pass.
Change-Id: I3186adf9835b4d0cada272d156b17d1bc9c2b799
The purpose of this change is to improve upon
https://review.openstack.org/#/c/531122/
- Moved vars inside the defaults/main.yml file
- Made the regex for the lineinfile safer
Change-Id: Id581c0b36f3d4bd61d3627b8364b79296b967387
Closes-Bug: 1746567
Related-Bug: 1682153
In the recover_cluster.yaml playbook, the task to find the highest
seqno/Global Transaction ID no longer relies only on grastate.dat.
Instead it now follows the recommendations from the Galera Cluster website:
http://galeracluster.com/documentation-webpages/restartingcluster.html
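Once each node's seqno is known, choosing where to bootstrap comes down
to taking the maximum. A minimal Python sketch, assuming the seqnos
have already been collected into a dict keyed by host name (the
function name and the tie-breaking rule are illustrative choices, not
taken from the playbook):

```python
def pick_bootstrap_host(seqnos):
    """Return the host with the highest seqno.

    seqnos maps host name -> int seqno. Ties are broken by host name
    so that the result is deterministic. Raises ValueError if empty.
    """
    return max(sorted(seqnos), key=lambda host: seqnos[host])


print(pick_bootstrap_host({"db01": 1200, "db02": 1234, "db03": 1234}))  # db02
```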
Closes-Bug: 1682153
Change-Id: I5fc3eaa8baee659576c4c39aef9cfd351c8e9af7
Current Debian stretch uses MariaDB 10.1.26, which integrates
a backup tool called 'MariaDB Backup' [1]. It is based on
Percona XtraBackup and supports full backup capability for
MariaDB Server, including encrypted and compressed data.
This patch also fixes multi-node deployment failing on Debian
aarch64. Percona's repo has no XtraBackup package for Debian
aarch64. In that case we can use MariaDB's built-in backup tool,
'MariaDB Backup'.
[1] https://mariadb.com/kb/en/library/mariadb-backup-overview/
Change-Id: I7271d3f93b41d4839670a2c4a358744333411cd7
Kolla assumes MariaDB 10.0, where Galera was always enabled. This
works for CentOS and Ubuntu as they use 10.0 on x86-64.
Debian uses MariaDB 10.1 by default (as do CentOS on !x86-64 and
Ubuntu/aarch64), where Galera is disabled by default.
Closes-Bug: 1740060
Co-Authored-By: Xinliang Liu <xinliang.liu@linaro.org>
Change-Id: I8374ac2219ad7880970cd789727d01af7cac1077
In some deployment scenarios the current timeouts
for mariadb bootstrap and kibana registration with
elasticsearch have been found to be too short. These
timeout increases shouldn't introduce any deployment
slowdown in current environments and should eliminate
deployment failures in environments with slower
systems.
Change-Id: If33dfff2b42b23eff7ec2230c9b0c5a4c010072e
Added 'executable' argument to the shell action in the
'Comparing seqno value' task in the cluster recovery playbook.
Change-Id: I3e96a4a76b44ffb558b9a41cde16e66a8d0fab1a
Closes-Bug: #1729603
For the genconfig command, master_host will not be defined as it is
defined dynamically in bootstrap.yml.
Co-Authored-By: Stig Telfer <stig@stackhpc.com>
Change-Id: Ib988c8e2de475e9b973fed2f7f752cb2500953c3
Closes-Bug: #1707856
Add config_owner_user and config_owner_group to group_vars/all,
which are the user and group of Kolla configuration files in /etc/kolla.
Add become to post-deploy playbook.
Add become to only necessary tasks in roles:
- certificate
- common
- destroy
- haproxy
- mariadb
- memcached
- rabbitmq
Change-Id: I2aba745a6e3928c52642f64551470fd08cbfd058
Partial-Implements: blueprint ansible-specific-task-become
kolla-kubernetes is using its own configuration generation[0], so it is
time for kolla-ansible to remove the related code to simplify the
logic.
[0] https://github.com/openstack/kolla-kubernetes/tree/master/ansible
Change-Id: I7bb0b7fe3b8eea906613e936d5e9d19f4f2e80bb
Implements: blueprint clean-k8s-config
Ansible tasks support the vars directive, so there is no need to implement
another one in merge_config. This patch removes the vars directive from the
merge_config action plugin.
Change-Id: I33648a2b6e39b4d49ce76eb66fbf2522721f8c68
always_run is deprecated and removed in Ansible 2.4.
check_mode was introduced in Ansible 2.2 and Kolla-ansible has bumped Ansible to
2.2.0, so it's safe to replace always_run with check_mode now.
Change-Id: Id1028d38b7bde30a6afe17b319dcdc77907914ab
Closes-Bug: #1643633
Implements: blueprint migrate-to-ansible-2-2-0
The wait_for module waits 300 seconds for the port to be started or stopped.
This is pointless in prechecks. This patch changes the timeout to 1
second.
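The intent of the precheck is a single quick probe, not a long wait.
In Python terms it behaves roughly like the sketch below; this is an
analogy to the wait_for task with a one second timeout, not code from
the role, and the function name is hypothetical.

```python
import socket


def port_is_free(host, port, timeout=1.0):
    """Return True if nothing accepts TCP connections on (host, port).

    Assumption: a refused or timed-out connection both count as free.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return False  # something is already listening
    except OSError:
        return True
```

With a short timeout the precheck fails fast when the port is taken,
which is all a precheck needs to report.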
Change-Id: I9b251ec4ba17ce446655917e8ef5e152ef947298
Closes-Bug: #1688152
MariaDB uses utf8_general_ci as its default collation:
- https://mariadb.com/kb/en/mariadb/supported-character-sets-and-collations/
This means all Devstack databases are created with the utf8_general_ci collation,
so a service/project may deploy successfully via Devstack
but fail with a Kolla deployment.
Therefore, we should use the same default collation for Kolla-ansible.
Change-Id: Icbb6c15f536fc6986816c58f4fd68bfb95813e46
Closes-Bug: 1680783