This patch adds the loadbalancer-config role,
which is a "wrapper" around the haproxy-config
role and the proxysql-config role, which will
be added in follow-up patches.
Change-Id: I64d41507317081e1860a94b9481a85c8d400797d
Kolla environment currently uses haproxy
to provide HA for mariadb. This patch
switches from haproxy to proxysql, if enabled.
This patch also replaces mariadb's user
'haproxy' with user 'monitor'. This replacement
has two reasons:
- Use a better name for the user that "monitors"
the galera cluster, as there are two services
using this user (HAProxy, ProxySQL)
- Set a password for the monitor user, as it is
always better to use a password than not to.
The previous haproxy user did not use a password,
as that was historically not possible with
haproxy, and mariadb-clustercheck was not
implemented.
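A minimal sketch of how such a monitor user could be created with the
community.mysql collection (the user name comes from this patch; the
password variable and privilege are illustrative assumptions):

  - name: Create monitor user for HAProxy/ProxySQL health checks
    community.mysql.mysql_user:
      name: monitor
      password: "{{ mariadb_monitor_password }}"  # assumed variable name
      host: "%"
      priv: "*.*:USAGE"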
Depends-On: https://review.opendev.org/c/openstack/kolla/+/769385
Depends-On: https://review.opendev.org/c/openstack/kolla/+/765781
Depends-On: https://review.opendev.org/c/openstack/kolla/+/850656
Change-Id: I0edae33d982c2e3f3b5f34b3d5ad07a431162844
Role vars have a higher precedence than role defaults. This allows
importing default vars from another role via vars_files without
overriding project_name (see the related bug for details).
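A minimal sketch of the pattern this enables (all paths and names here
are illustrative assumptions):

  # roles/mariadb/vars/main.yml - a role var, which wins over defaults
  project_name: mariadb

  # a play can then pull another role's defaults via vars_files without
  # overriding project_name:
  vars_files:
    - roles/loadbalancer-config/defaults/main.yml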
Change-Id: I3d919736e53d6f3e1a70d1267cf42c8d2c0ad221
Related-Bug: #1951785
When running the mariadb role with --limit, the group_by module only
includes the limited hosts; hosts excluded by the limit are not added
to their shard groups.
Using add_host instead adds all hosts in the mariadb group to their
shard groups.
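A minimal sketch of the add_host approach (the shard group naming is
an assumption):

  - name: Add all mariadb hosts to their shard groups
    add_host:
      name: "{{ item }}"
      groups: "shard_{{ hostvars[item]['mariadb_shard_id'] | default(0) }}"
    loop: "{{ groups['mariadb'] }}"
    changed_when: false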
Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Change-Id: I1331698e313bd714a16fc35f38fb579d75b56370
Closes-Bug: #1947589
There seems to be a bug in Galera that causes
TASK [mariadb : Check MariaDB service WSREP sync status]
to fail.
One (in case of 3-node cluster) or more (possible with
more-than-3-node clusters) nodes may "lose the race" and get stuck
in the "initialized" state of WSREP.
This is entirely random as is the case with most race issues.
MariaDB service restart on that node will fix the situation but
it's unwieldy.
The above may happen because Kolla Ansible starts and waits for
all new nodes at once.
This did not bother the old galera (galera 3) which figured out
the ordering for itself and let each node join the cluster properly.
The proposed workaround is to start and wait for nodes serially.
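One way to serialize the wait in Ansible is task-level throttling (a
sketch only, not necessarily this patch's exact implementation;
credentials and retry values are illustrative):

  - name: Check MariaDB service WSREP sync status
    command: mysql -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"
    register: wsrep_status
    until: "'Synced' in wsrep_status.stdout"
    retries: 10
    delay: 6
    changed_when: false
    throttle: 1  # let one node at a time join and sync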
Change-Id: I449d4c2073d4e3953e9f09725577d2e1c9d563c9
Closes-Bug: #1947485
"BINLOG MONITOR" and "SLAVE MONITOR" replace
"REPLICATION CLIENT" (which is now an alias for "BINLOG MONITOR").
The validation in the Ansible MySQL collection is too simple to
understand aliases and breaks on them. Hence, let's use the canonical
names and adapt per service according to its needs.
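With the canonical names, a grant might look like this (the user and
the exact privilege set are illustrative assumptions):

  - name: Create backup user with canonical MariaDB privilege names
    community.mysql.mysql_user:
      name: backup
      password: "{{ mariadb_backup_database_password }}"  # assumed
      host: "%"
      priv: "*.*:BINLOG MONITOR,SLAVE MONITOR,RELOAD,PROCESS,LOCK TABLES"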
Change-Id: I1175e4846384accd19942620dc155d0c5728e64b
We get a nice optimisation by using a filtered loop instead of
skipping a task per service with 'when'.
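A minimal sketch of the pattern (the service list and attribute names
are illustrative):

  - name: Create per-service database users
    community.mysql.mysql_user:
      name: "{{ item.user }}"
      password: "{{ item.password }}"
    loop: "{{ services | selectattr('enabled') | list }}"

instead of one task per service guarded by 'when: <service>.enabled'.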
Partially-Implements: blueprint performance-improvements
Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
By default, Ansible injects a variable for every fact, prefixed with
ansible_. This can result in a large number of variables for each host,
which at scale can incur a performance penalty. Ansible provides a
configuration option [0] that can be set to False to prevent this
injection of facts. In this case, facts should be referenced via
ansible_facts.<fact>.
This change updates all references to Ansible facts within Kolla Ansible
from using individual fact variables to using the items in the
ansible_facts dictionary. This allows users to disable fact variable
injection in their Ansible configuration, which may provide some
performance improvement.
This change disables fact variable injection in the ansible
configuration used in CI, to catch any attempts to use the injected
variables.
[0] https://docs.ansible.com/ansible/latest/reference_appendices/config.html#inject-facts-as-vars
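For example, with inject_facts_as_vars = False set under [defaults] in
ansible.cfg, tasks reference facts via the dictionary (a sketch):

  - name: Show host facts via the ansible_facts dictionary
    debug:
      msg: "{{ ansible_facts.hostname }} ({{ ansible_facts.distribution }})"

instead of the injected ansible_hostname and ansible_distribution
variables.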
Change-Id: I7e9d5c9b8b9164d4aee3abb4e37c8f28d98ff5d1
Partially-Implements: blueprint performance-improvements
The mariadb image was removed in Wallaby, leading to database backup
failures.
Change-Id: I90986e7521779997df2782767bb95efcbd8ef232
Closes-Bug: #1928129
- Replace hardcoded haproxy monitor user with variable.
- Rename mariadb_backup variable to mariadb_backup_possible.
- Drop creation of monitor user in handlers as this is
now handled in register.yml for good reason.
Change-Id: I255a79d36ae18ca42d0befd00b235ca509197db3
Kolla-ansible currently installs the mariadb
cluster on hosts defined in group['mariadb']
and renders the haproxy configuration for these hosts.
This is not enough if a user wants to have several
service databases in several mariadb clusters (shards).
Spreading service databases across multiple clusters (shards)
is useful especially for databases with high load
(neutron, nova).
How does it work?
It works exactly the same as before, but the group reference 'mariadb'
is now used as the group where all mariadb clusters (shards)
are located, and mariadb clusters are installed into
dynamic groups created by group_by and the host variable
'mariadb_shard_id' (see the sketch after the inventory example below).
It also adds a special user 'shard_X' which will be used
for creating users and databases, but only if haproxy
is not used as the load-balancing solution.
This patch will not affect users who have all databases
on the same db cluster on hosts in group 'mariadb'; the host
variable 'mariadb_shard_id' is set to 0 if not defined.
Mariadb's task in loadbalancer.yml (haproxy) configures
the mariadb default shard hosts as haproxy backends. If the mariadb
role is used to install several clusters (shards), only
the default one is load-balanced via haproxy.
Mariadb's backup works only for the default shard (cluster)
when haproxy is used as the mariadb loadbalancer; if proxysql
is used, all shards are backed up.
Once this patch is merged, it will pave the way for proxysql
patches which will implement L7 SQL balancing based on
users and schemas.
Example of inventory:
[mariadb]
server1
server2
server3 mariadb_shard_id=1
server4 mariadb_shard_id=1
server5 mariadb_shard_id=2
server6 mariadb_shard_id=3
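A minimal sketch of the dynamic grouping (the group key naming is an
assumption):

  - name: Group MariaDB hosts by shard
    group_by:
      key: "shard_{{ mariadb_shard_id | default(0) }}"

Each host then lands in a group such as shard_0 or shard_1, which the
role targets when installing each cluster (shard).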
Extra:
wait_for_loadbalancer is removed rather than modified, as its role is
already served by the check. The relevant refactor is applied as well.
Change-Id: I933067f22ecabc03247ea42baf04f19100dffd08
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
This trivial patch just turns off the ansible
changed report for group_by tasks, as it could
be confusing for users.
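For example (a sketch):

  - name: Group hosts by MariaDB shard
    group_by:
      key: "shard_{{ mariadb_shard_id | default(0) }}"
    changed_when: false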
Change-Id: I7512af573782359a6f01290a55291ac7eb0de867
Negative seqno values need to be considered in the comparison in some
cases, but the task does not support that, so we need to make it work.
1. We use mariabackup to restore data on control1, delete the
mariadb data on control2 and control3, and then use cluster recovery;
as a result, the seqno of the other two nodes will be '-1'.
2. We add one more control node into our existing mariadb cluster
and then use cluster recovery; the seqno of the new node will be '-1'.
Change-Id: Ic1ac8656f28c3835e091637014f075ac5479d390
The handler was firing even when we were only generating config.
This is an issue because the services may not have been deployed.
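One way to guard a handler so it does not run during genconfig (a
sketch; the variable name follows kolla-ansible conventions but is an
assumption here, as is the handler body):

  - name: Restart mariadb container
    kolla_docker:
      name: mariadb
      action: restart_container
    when: kolla_action != "config"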
TrivialFix
Change-Id: I2f832d73138b4c9f29e3c71e2463293eab71483a
This reverts commit 9cae59be51e8d2d798830042a5fd448a4aa5e7dc.
Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues.
Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc
Closes-Bug: #1906288
Mariadb recovery fails if a cluster has previously been deployed, but any of
the mariadb containers do not exist.
Steps to reproduce
==================
* Deploy a mariadb galera cluster
* Remove the mariadb container from at least one host (docker rm -f mariadb)
* Run kolla-ansible mariadb_recovery
Expected results
================
The cluster is recovered, and a new container deployed where necessary.
Actual results
==============
The task 'Stop MariaDB containers' fails on any host where the container does
not exist.
Solution
========
This change fixes the issue by using the 'ignore_missing' flag for kolla_docker
with the stop_container action. This means the task does not fail when the
container does not exist. It is also necessary to swap some 'docker cp'
commands for 'cp' on the host, using the path to the volume.
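A minimal sketch of the fixed task (the action and flag are named
above; other parameters are illustrative):

  - name: Stop MariaDB containers
    kolla_docker:
      name: mariadb
      action: stop_container
      ignore_missing: true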
Closes-Bug: #1907658
Change-Id: Ibd4a6adeb8443e12c45cbab65f501392ffb16fc7
CentOS 8 should work fine without the workaround.
This change adds the missing CentOS 8 IPv6 CI job as well.
Change-Id: I58af7a09b5ae09a10b9efc33c1f30c2efc6613f7
Main plays are action-redirect stubs, ideal for import_tasks.
This avoids the 'include' penalty and makes logs/ARA look nicer.
Also fixes haproxy and rabbitmq so that they do not check the host
group.
Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0
Partially-Implements: blueprint performance-improvements
Config plays do not need to check containers. This avoids skipping
tasks during the genconfig action.
Ironic and Glance rolling upgrades are handled specially.
Swift and Bifrost do not use the handlers at all.
Partially-Implements: blueprint performance-improvements
Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a
Trivial: log-error & log-bin are both invalid mariadb config options.
The appropriate options are log_error & log_bin.
Note - this change is mostly unnecessary, as log_error is provided via
the cli and the log_bin value is the default.
Change-Id: If7051f7139a68864e599cccffaf17c21855fc4a8
Since change [1] merged, we have two mariadb images (mariadb and
mariadb-server).
Let's use mariadb-server in kolla-ansible, so we can deprecate the
mariadb image.
[1]: https://review.opendev.org/#/c/710217/
Change-Id: I4ae2ccaaba8fb516f469f4ce8628e8c61de03f0d
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. For unconditionally included tasks, switching to
import_tasks provides a clear benefit.
Benchmarking of include vs. import is available at [1].
This change switches from include_tasks to import_tasks where there is
no condition applied to the include.
[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md#task-include-and-import
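For example, an unconditional include such as (a sketch):

  - include_tasks: config.yml

becomes:

  - import_tasks: config.yml

while conditional includes keep using include_tasks, so the condition
is evaluated once rather than on every imported task.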
Partially-Implements: blueprint performance-improvements
Change-Id: Ia45af4a198e422773d9f009c7f7b2e32ce9e3b97
Previously we mounted /etc/timezone if kolla_base_distro was debian
or ubuntu. This would fail prechecks if debian or ubuntu images were
deployed on CentOS. While this is not a supported combination, for
correctness we should fix the condition to reference the host OS rather
than the container OS, since that is where the /etc/timezone file is
located.
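A minimal sketch of a precheck guarded by the host OS rather than the
container OS (the fact used and the task itself are illustrative
assumptions):

  - name: Check that /etc/timezone exists on Debian-family hosts
    stat:
      path: /etc/timezone
    register: etc_timezone
    when: ansible_facts.os_family == 'Debian'  # host OS, not kolla_base_distro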
Change-Id: Ifc252ae793e6974356fcdca810b373f362d24ba5
Closes-Bug: #1882553