141 Commits

Author SHA1 Message Date
Ivan Halomi
7a9f04573a Adding container engine to kolla_container_facts
Second part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which was suggested to split patch into smaller ones.

This change adds container_engine variable to kolla_container_facts
module, this prepares module to be used with docker and podman as well
without further changes in roles.

Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I9e8fa30646844ab4a288555f3aafdda345b3a118
2022-11-02 13:44:45 +01:00
Ivan Halomi
910f9bd36f Usage of kolla_container_engine variable instead of docker
First part of patchset:
 https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which was suggested to split patch into smaller ones.

This implements kolla_container_engine variable
in command calls of docker,so later on it can be
also used for podman without further change.

Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Change-Id: Ic30b67daa2e215524096ad1f4385c569e3d41b95
2022-10-28 09:15:55 +02:00
Michal Nasiadka
1aac65de0c Fix issues introduced by ansible-lint 6.6.0
mainly jinja spacing and jinja[invalid] related

Change-Id: I6f52f2b0c1ef76de626657d79486d31e0f47f384
2022-09-21 14:34:54 +00:00
Michal Arbet
63d72ea7e8 Use Docker healthchecks for mariadb-server service
This change enables the use of Docker healthchecks for
mariadb-server service.

Depends-On: https://review.opendev.org/c/openstack/kolla/+/805613
Change-Id: I893687a0501ea0f281b879df3141a354bff9eca6
2022-08-22 08:27:28 +00:00
Michal Arbet
4838591c6c Add loadbalancer-config role and wrap haproxy-config role inside
This patch adds loadbalancer-config role
which is "wrapper" around haproxy-config
and proxysql-config role which will be added
in follow-up patches.

Change-Id: I64d41507317081e1860a94b9481a85c8d400797d
2022-08-09 12:15:49 +02:00
Michal Arbet
de973b81fa Add proxysql support for database
Kolla environment currently uses haproxy
to fullfill HA in mariadb. This patch
is switching haproxy to proxysql if enabled.

This patch is also replacing mariadb's user
'haproxy' with user 'monitor'. This replacement
has two reasons:
  - Use better name to "monitor" galera claster
    as there are two services using this user
    (HAProxy, ProxySQL)
  - Set password for monitor user as it's
    always better to use password then not use.
    Previous haproxy user didn't use password
    as it was historically not possible with
    haproxy and mariadb-clustercheck wasn't
    implemented.

Depends-On: https://review.opendev.org/c/openstack/kolla/+/769385
Depends-On: https://review.opendev.org/c/openstack/kolla/+/765781
Depends-On: https://review.opendev.org/c/openstack/kolla/+/850656

Change-Id: I0edae33d982c2e3f3b5f34b3d5ad07a431162844
2022-07-29 15:05:21 +02:00
Michal Nasiadka
f940e6aa31 clustercheck: move from xinetd to socat
Needed for CentOS Stream 9 and Rocky Linux 9.

Change-Id: I614e64e227304fdc50c08bd16d67ccf03586b92c
2022-07-26 07:13:34 +00:00
Piotr Parczewski
377b6b4251 Improve MariaDB restore procedure
Change-Id: Id05122cb564f3e7475b2b76da8c111e2c72601b8
2022-01-11 21:47:21 +01:00
Seena Fallah
68cd2a0553 mariadb: use add_host to include inactive hosts in shard grouping
In case of running mariadb role with --limit the group_by module will only include the limited hosts and other hosts that are not limited by ansible will not be included.
Using add_host will add all hosts in mariadb group to their shards group

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
Change-Id: I1331698e313bd714a16fc35f38fb579d75b56370
Closes-Bug: #1947589
2021-10-28 16:29:05 +00:00
Michal Nasiadka
b6b7401c0d mariadb: Remove wsrep-notify.sh
Closes-Bug: #1947534

Change-Id: I08be074c3633cc4fb0a0bc6c9cb8d03eb5226d89
2021-10-20 05:37:57 +00:00
Radosław Piliszek
c7c14e1c43 Fix privileges for MariaDB 10.5
"BINLOG MONITOR" and "SLAVE MONITOR" replace
"REPLICATION CLIENT" (which is now an alias for "BINLOG MONITOR").
The validation in Ansible MySQL collection is too simple to
understand aliases and breaks. Hence, let's use the canonical
names and adapt per service according to its needs.

Change-Id: I1175e4846384accd19942620dc155d0c5728e64b
2021-10-07 09:24:31 +00:00
Radosław Piliszek
9ff2ecb031 Refactor and optimise image pulling
We get a nice optimisation by using a filtered loop instead
of task skipping per service with 'when'.

Partially-Implements: blueprint performance-improvements
Change-Id: I8f68100870ab90cb2d6b68a66a4c97df9ea4ff52
2021-08-10 11:57:54 +00:00
Zuul
980dd33721 Merge "mariadb: Deprecate wsrep-notify.sh" 2021-04-21 09:50:44 +00:00
Michał Nasiadka
451844ac67 mariadb: Deprecate wsrep-notify.sh
Change-Id: I14376dac46809f8bb466ec41f279be8d323d459d
2021-04-15 08:12:31 +00:00
Michal Arbet
5d17100118 Additional small changes in role/mariadb
- Replace hardcoded haproxy monitor user with variable.
 - Rename mariadb_backup variable to mariadb_backup_possible.
 - Drop creation of monitor user in handlers as this is
   now handled in register.yml for good reason.

Change-Id: I255a79d36ae18ca42d0befd00b235ca509197db3
2021-04-14 16:10:30 +02:00
Michał Nasiadka
d7a9be84d4 mariadb: Disable wsrep-notify script if clustercheck enabled
Change-Id: Id16ec7d7b57630ae20430675c4a196e63ca8d4a5
2021-04-14 09:46:20 +00:00
Michal Arbet
09b3c6ca07 Refactor mariadb to support shards
Kolla-ansible is currently installing mariadb
cluster on hosts defined in group['mariadb']
and render haproxy configuration for this hosts.

This is not enough if user want to have several
service databases in several mariadb clusters (shards).

Spread service databases to multiple clusters (shards)
is usefull especially for databases with high load
(neutron,nova).

How it works ?

It works exactly same as now, but group reference 'mariadb'
is now used as group where all mariadb clusters (shards)
are located, and mariadb clusters are installed to
dynamic groups created by group_by and host variable
'mariadb_shard_id'.

It also adding special user 'shard_X' which will be used
for creating users and databases, but only if haproxy
is not used as load-balance solution.

This patch will not affect user which has all databases
on same db cluster on hosts in group 'mariadb', host
variable 'mariadb_shard_id' is set to 0 if not defined.

Mariadb's task in loadbalancer.yml (haproxy) is configuring
mariadb default shard hosts as haproxy backends. If mariadb
role is used to install several clusters (shards), only
default one is loadbalanced via haproxy.

Mariadb's backup is working only for default shard (cluster)
when using haproxy as mariadb loadbalancer, if proxysql
is used, all shards are backuped.

After this patch will be merged, there will be way for proxysql
patches which will implement L7 SQL balancing based on
users and schemas.

Example of inventory:

[mariadb]
server1
server2
server3 mariadb_shard_id=1
server4 mariadb_shard_id=1
server5 mariadb_shard_id=2
server6 mariadb_shard_id=3

Extra:
wait_for_loadbalancer is removed instead of modified as its role
is served by check already. The relevant refactor is applied as
well.

Change-Id: I933067f22ecabc03247ea42baf04f19100dffd08
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2021-04-07 23:19:42 +02:00
Michal Arbet
209dc1e9dc Set changed_when to false for group_by tasks
This trivial patch is just turning off ansible
changed report for group_by tasks as it could
be confusing for user.

Change-Id: I7512af573782359a6f01290a55291ac7eb0de867
2021-03-13 13:59:23 +00:00
fudunwei
068f3fea50 Negative seqno need to be considered when comparing seqno
Need to consider Negative seqno to compare in some cases,
but the task does not support to do that, we need to make it work.

1.we use mariabackup to restore datas on control1, delete the
mariadb data on control2 and control3, and then use cluster recovery,
 as a result that the seqno of the other two nodes will be '-1'.

2. add one more control node into our existing mariadb cluster,
and then use cluster recovery, the seqno of the new node will be '-1'.

Change-Id: Ic1ac8656f28c3835e091637014f075ac5479d390
2021-01-29 13:46:37 +08:00
fudunwei
aec42f0f5f Correct spell error
correct spell 'Cheching' to 'Checking'

Change-Id: I3ceb6960c3b38f371d0d4163ee37d4b34e61f401
2021-01-25 09:58:12 +08:00
Zuul
cb6ffa25e8 Merge "Fix mariadb_recovery when mariadb container is missing" 2020-12-16 09:36:54 +00:00
Mark Goddard
db4fc85c33 Revert "Performance: Use import_tasks in the main plays"
This reverts commit 9cae59be51e8d2d798830042a5fd448a4aa5e7dc.

Reason for revert: This patch was found to introduce issues with fluentd customisation. The underlying issue is not currently fully understood, but could be a sign of other obscure issues.

Change-Id: Ia4859c23d85699621a3b734d6cedb70225576dfc
Closes-Bug: #1906288
2020-12-14 10:36:55 +00:00
Mark Goddard
f903d774af Fix mariadb_recovery when mariadb container is missing
Mariadb recovery fails if a cluster has previously been deployed, but any of
the mariadb containers do not exist.

Steps to reproduce
==================

* Deploy a mariadb galera cluster
* Remove the mariadb container from at least one host (docker rm -f mariadb)
* Run kolla-ansible mariadb_recovery

Expected results
================

The cluster is recovered, and a new container deployed where necessary.

Actual results
==============

The task 'Stop MariaDB containers' fails on any host where the container does
not exist.

Solution
========

This change fixes the issue by using the 'ignore_missing' flag for kolla_docker
with the stop_container action. This means the task does not fail when the
container does not exist. It is also necessary to swap some 'docker cp'
commands for 'cp' on the host, using the path to the volume.

Closes-Bug: #1907658

Change-Id: Ibd4a6adeb8443e12c45cbab65f501392ffb16fc7
2020-12-10 12:27:25 +00:00
Radosław Piliszek
9cae59be51 Performance: Use import_tasks in the main plays
Main plays are action-redirect-stubs, ideal for import_tasks.

This avoids 'include' penalty and makes logs/ara look nicer.

Fixes haproxy and rabbitmq not to check the host group as well.

Change-Id: I46136fc40b815e341befff80b54a91ef431eabc0
Partially-Implements: blueprint performance-improvements
2020-10-27 19:09:32 +01:00
Radosław Piliszek
3411b9e420 Performance: optimize genconfig
Config plays do not need to check containers. This avoids skipping
tasks during the genconfig action.

Ironic and Glance rolling upgrades are handled specially.

Swift and Bifrost do not use the handlers at all.

Partially-Implements: blueprint performance-improvements
Change-Id: I140bf71d62e8f0932c96270d1f08940a5ba4542a
2020-10-12 19:30:06 +02:00
Mark Goddard
b685ac44e0 Performance: replace unconditional include_tasks with import_tasks
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. For unconditionally included tasks, switching to
import_tasks provides a clear benefit.

Benchmarking of include vs. import is available at [1].

This change switches from include_tasks to import_tasks where there is
no condition applied to the include.

[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md#task-include-and-import

Partially-Implements: blueprint performance-improvements

Change-Id: Ia45af4a198e422773d9f009c7f7b2e32ce9e3b97
2020-08-28 16:12:03 +00:00
Mark Goddard
9702d4c3c3 Performance: use import_tasks for check-containers.yml
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. In the case of the check-containers.yml include, the
included file only has a single task, so the overhead of skipping this
task will not be greater than the overhead of the task import. It
therefore makes sense to switch to use import_tasks there.

Partially-Implements: blueprint performance-improvements

Change-Id: I65d911670649960708b9f6a4c110d1a7df1ad8f7
2020-07-28 12:10:59 +01:00
Mark Goddard
b84d2f8b77 Fix handler notification for mariadb-clustercheck
This was missed in the original patch.

Change-Id: I991b0563560cf4a0b1feb718951ffdf21ab81856
2020-06-08 14:43:34 +01:00
Zuul
6f829575c9 Merge "Custom haproxy script for monitoring galera" 2020-06-02 15:01:55 +00:00
Michal Nasiadka
026f5cc48a Custom haproxy script for monitoring galera
Depends-On: https://review.opendev.org/710217/

Change-Id: I85652f23e487c40192106d23f2cdd45a3077deca
2020-05-20 13:02:44 +02:00
Zuul
87984f5425 Merge "Add Ansible group check to prechecks" 2020-04-16 15:33:46 +00:00
LinPeiWen
8a206699d4 mariadb container name variable
mariadb container name variable is fixed in some places,
but in the defaults directory, mariadb container_name variable
is variable. If the mariadb container_name variable is changed
during deployment, it will not be assigned to container_name,
but a fixed 'mariadb' name.

Change-Id: Ie8efa509953d5efa5c3073c9b550be051a7f4f9b
2020-03-25 01:17:29 -04:00
Radosław Piliszek
266fd61ad7 Use "name:" instead of "role:" for *_role modules
Both include_role and import_role expect role's name to be given
via "name" param instead of "role".
This worked but caused errors with ansible-lint.
See: https://review.opendev.org/694779

Change-Id: I388d4ae27111e430d38df1abcb6c6127d90a06e0
2020-03-02 10:01:17 +01:00
Mark Goddard
49fb55f182 Add Ansible group check to prechecks
We assume that all groups are present in the inventory, and quite obtuse
errors can result if any are not.

This change adds a precheck that checks for the presence of all expected
groups in the inventory for each service. It also introduces a common
service-precheck role that we can use for other common prechecks.

Change-Id: Ia0af1e7df4fff7f07cd6530e5b017db8fba530b3
Partially-Implements: blueprint improve-prechecks
2020-02-28 16:23:14 +00:00
Radosław Piliszek
1ea029a91d Followup on MariaDB handling fixes
This fixes issues reported by Mark:
- possible failure with 4-node cluster (however unlikely)
- failure to stop all nodes from progressing when conditions are
  not valid (due to: "any_errors_fatal: False")

Change-Id: Ib6995bf4c99202c9813859b3d9e2f420448f0445
2020-02-02 16:39:29 +01:00
Zuul
67a9d289b4 Merge "Fix multiple issues with MariaDB handling" 2020-01-21 09:29:59 +00:00
Radosław Piliszek
9f14ad651a Fix multiple issues with MariaDB handling
These affected both deploy (and reconfigure) and upgrade
resulting in WSREP issues, failed deploys or need to
recover the cluster.

This patch makes sure k-a does not abruptly terminate
nodes to break cluster.
This is achieved by cleaner separation between stages
(bootstrap, restart current, deploy new) and 3 phases
for restarts (to keep the quorum).

Upgrade actions, which operate on a healthy cluster,
went to its section.

Service restart was refactored.

We no longer rely on the master/slave distinction as
all nodes are masters in Galera.

Closes-bug: #1857908
Closes-bug: #1859145
Change-Id: I83600c69141714fc412df0976f49019a857655f5
2020-01-15 20:15:09 +01:00
Mark Goddard
5fb10e08fe Ansible lint: use command module instead of shell
Change-Id: Ibf40216b847f103e383f19fe1ef608a75fcfd452
Co-Authored-By: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
2020-01-13 10:45:10 +00:00
Mark Goddard
a6cb008c54 Ansible lint: task names
Change-Id: Iecbc2fe5fa3391dca5a3cc7e575314b95942114b
Co-Authored-By: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
2020-01-13 10:38:12 +00:00
Michal Nasiadka
1009931162 Change local_action to delegate_to: localhost
As part of the effort to implement Ansible code linting in CI
(using ansible-lint) - we need to implement recommendations from
ansible-lint output [1].

One of them is to stop using local_action in favor of delegate_to -
to increase readability and and match the style of typical ansible
tasks.

[1]: https://review.opendev.org/694779/

Partially implements: blueprint ansible-lint

Change-Id: I46c259ddad5a6aaf9c7301e6c44cd8a1d5c457d3
2019-11-22 15:04:44 +00:00
Mark Goddard
f979ae1f8e Fix restart policy after MariaDB recovery
After performing a recovery of MariaDB, the mariadb containers are left
without a restart policy. This leaves them unable to recover from the
crash of a single galera node. There is another issue, in that the
'master' node is left in a bootstrap configuration, with the
--wsrep-new-cluster argument configured as BOOTSTRAP_ARGS.

This change fixes these issues by removing the restart policy of 'no'
from the 'slave' containers, and recreating the master container without
the restart policy or bootstrap arguments.

Change-Id: I36c875611931163ca2c29ae93b71d3af64cb197c
Closes-Bug: #1851594
2019-11-07 10:08:57 +00:00
Zuul
6160cdc576 Merge "Use mariabackup for database backups" 2019-11-04 12:10:42 +00:00
Mark Goddard
7f47ddf7f4 Use mariabackup for database backups
Currently, Xtrabackup is used for database backups. However, Xtrabackup
is not compatible with MariaDB 10.3. This change switches to use
mariabackup [1], which is available in the mariadb image.

The documented full and incremental restore procedures have been
modified to use mariabackup, following [2] and [3].

[1] https://mariadb.com/kb/en/library/mariabackup-overview/
[2] https://mariadb.com/kb/en/library/full-backup-and-restore-with-mariabackup/
[3] https://mariadb.com/kb/en/library/incremental-backup-and-restore-with-mariabackup/

Change-Id: Id52b9b1f7b013277e401b1f6b8aed34473d2b2c4
Closes-Bug: #1843043
Depends-On: https://review.opendev.org/691290
2019-11-01 18:44:10 +00:00
Mark Goddard
a6372a66f2 Fix deploy-containers command for mariadb
The MariaDB handlers require master_host to be set.

TrivialFix

Change-Id: I162efbd9e615b86dcdc6e8a4af081cda2f8b0b2b
2019-10-25 17:20:52 +00:00
Kris Lindgren
2fe0d98ebb Add a job that *only* deploys updated containers
Sometimes as cloud admins, we want to only update code that is running
in a cloud.  But we dont need to do anything else.  Make an action in
kolla-ansible that allows us to do that.

Change-Id: I904f595c69f7276e71692696471e32fd1f88e6e8
Implements: blueprint deploy-containers-action
2019-09-26 17:51:14 +01:00
Zuul
d8e961eeaa Merge "Wait for MariaDB to be accessible via HAProxy" 2019-08-27 12:58:06 +00:00
Scott Solkhon
03cd7eb356 Wait for MariaDB to be accessible via HAProxy
Explicitly wait for the database to be accessible via the load balancer.
Sometimes it can reject connections even when all database services are up,
possibly due to the health check polling in HAProxy.

Closes-Bug: #1840145
Change-Id: I7601bb710097a78f6b29bc4018c71f2c6283eef2
2019-08-15 10:00:36 +00:00
Radosław Piliszek
6a737b1968 Fix handling of docker restart policy
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.

This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.

I added some FIXMEs around which are relevant to kolla-ansible docker
integration. They are not fixed in here to not alter behavior.

[1] https://review.opendev.org/667363

Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-18 13:39:06 +00:00
Mark Goddard
86f373a198 Fixes for MariaDB bootstrap and recovery
* Fix wsrep sequence number detection. Log message format is
  'WSREP: Recovered position: <UUID>:<seqno>' but we were picking out
  the UUID rather than the sequence number. This is as good as random.

* Add become: true to log file reading and removal since
  I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
  'docker cp' command which creates it.

* Don't run handlers during recovery. If the config files change we
  would end up restarting the cluster twice.

* Wait for wsrep recovery container completion (don't detach). This
  avoids a potential race between wsrep recovery and the subsequent
  'stop_container'.

* Finally, we now wait for the bootstrap host to report that it is in
  an OPERATIONAL state. Without this we can see errors where the
  MariaDB cluster is not ready when used by other services.

Change-Id: Iaf7862be1affab390f811fc485fd0eb6879fd583
Closes-Bug: #1834467
2019-07-05 09:20:34 +00:00
Mark Goddard
b123bf6621 Use become for all docker tasks
Many tasks that use Docker have become specified already, but
not all. This change ensures all tasks that use the following
modules have become:

* kolla_docker
* kolla_ceph_keyring
* kolla_toolbox
* kolla_container_facts

It also adds become for 'command' tasks that use docker CLI.

Change-Id: I4a5ebcedaccb9261dbc958ec67e8077d7980e496
2019-06-06 19:04:58 +01:00