19 Commits

Author SHA1 Message Date
Ivan Halomi
910f9bd36f Usage of kolla_container_engine variable instead of docker
First part of patchset:
 https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which was suggested to split patch into smaller ones.

This implements kolla_container_engine variable
in command calls of docker,so later on it can be
also used for podman without further change.

Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Change-Id: Ic30b67daa2e215524096ad1f4385c569e3d41b95
2022-10-28 09:15:55 +02:00
Radosław Piliszek
c94cc4a61a [mariadb] Start new nodes serially
There seems to be a bug in Galera that causes
TASK [mariadb : Check MariaDB service WSREP sync status]
to fail.
One (in case of 3-node cluster) or more (possible with
more-than-3-node clusters) nodes may "lose the race" and get stuck
in the "initialized" state of WSREP.
This is entirely random as is the case with most race issues.
MariaDB service restart on that node will fix the situation but
it's unwieldy.
The above may happen because Kolla Ansible starts and waits for
all new nodes at once.
This did not bother the old galera (galera 3) which figured out
the ordering for itself and let each node join the cluster properly.
The proposed workaround is to start and wait for nodes serially.

Change-Id: I449d4c2073d4e3953e9f09725577d2e1c9d563c9
Closes-Bug: #1947485
2021-10-17 07:58:46 +00:00
Zuul
f86a810b72 Merge "Fix "Restart mariadb-clustercheck container" during config gen" 2021-05-10 18:50:41 +00:00
Michal Arbet
5d17100118 Additional small changes in role/mariadb
- Replace hardcoded haproxy monitor user with variable.
 - Rename mariadb_backup variable to mariadb_backup_possible.
 - Drop creation of monitor user in handlers as this is
   now handled in register.yml for good reason.

Change-Id: I255a79d36ae18ca42d0befd00b235ca509197db3
2021-04-14 16:10:30 +02:00
Michal Arbet
09b3c6ca07 Refactor mariadb to support shards
Kolla-ansible is currently installing mariadb
cluster on hosts defined in group['mariadb']
and render haproxy configuration for this hosts.

This is not enough if user want to have several
service databases in several mariadb clusters (shards).

Spread service databases to multiple clusters (shards)
is usefull especially for databases with high load
(neutron,nova).

How it works ?

It works exactly same as now, but group reference 'mariadb'
is now used as group where all mariadb clusters (shards)
are located, and mariadb clusters are installed to
dynamic groups created by group_by and host variable
'mariadb_shard_id'.

It also adding special user 'shard_X' which will be used
for creating users and databases, but only if haproxy
is not used as load-balance solution.

This patch will not affect user which has all databases
on same db cluster on hosts in group 'mariadb', host
variable 'mariadb_shard_id' is set to 0 if not defined.

Mariadb's task in loadbalancer.yml (haproxy) is configuring
mariadb default shard hosts as haproxy backends. If mariadb
role is used to install several clusters (shards), only
default one is loadbalanced via haproxy.

Mariadb's backup is working only for default shard (cluster)
when using haproxy as mariadb loadbalancer, if proxysql
is used, all shards are backuped.

After this patch will be merged, there will be way for proxysql
patches which will implement L7 SQL balancing based on
users and schemas.

Example of inventory:

[mariadb]
server1
server2
server3 mariadb_shard_id=1
server4 mariadb_shard_id=1
server5 mariadb_shard_id=2
server6 mariadb_shard_id=3

Extra:
wait_for_loadbalancer is removed instead of modified as its role
is served by check already. The relevant refactor is applied as
well.

Change-Id: I933067f22ecabc03247ea42baf04f19100dffd08
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2021-04-07 23:19:42 +02:00
Will Szumski
ce012bcb09 Fix "Restart mariadb-clustercheck container" during config gen
The handler was firing even when we were only generating config.
This is an issue because the services may not have been deployed.

TrivialFix

Change-Id: I2f832d73138b4c9f29e3c71e2463293eab71483a
2021-01-21 12:07:21 +00:00
Michal Nasiadka
026f5cc48a Custom haproxy script for monitoring galera
Depends-On: https://review.opendev.org/710217/

Change-Id: I85652f23e487c40192106d23f2cdd45a3077deca
2020-05-20 13:02:44 +02:00
Radosław Piliszek
1ea029a91d Followup on MariaDB handling fixes
This fixes issues reported by Mark:
- possible failure with 4-node cluster (however unlikely)
- failure to stop all nodes from progressing when conditions are
  not valid (due to: "any_errors_fatal: False")

Change-Id: Ib6995bf4c99202c9813859b3d9e2f420448f0445
2020-02-02 16:39:29 +01:00
Radosław Piliszek
9f14ad651a Fix multiple issues with MariaDB handling
These affected both deploy (and reconfigure) and upgrade
resulting in WSREP issues, failed deploys or need to
recover the cluster.

This patch makes sure k-a does not abruptly terminate
nodes to break cluster.
This is achieved by cleaner separation between stages
(bootstrap, restart current, deploy new) and 3 phases
for restarts (to keep the quorum).

Upgrade actions, which operate on a healthy cluster,
went to its section.

Service restart was refactored.

We no longer rely on the master/slave distinction as
all nodes are masters in Galera.

Closes-bug: #1857908
Closes-bug: #1859145
Change-Id: I83600c69141714fc412df0976f49019a857655f5
2020-01-15 20:15:09 +01:00
Radosław Piliszek
6a737b1968 Fix handling of docker restart policy
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.

This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.

I added some FIXMEs around which are relevant to kolla-ansible docker
integration. They are not fixed in here to not alter behavior.

[1] https://review.opendev.org/667363

Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-18 13:39:06 +00:00
Mark Goddard
86f373a198 Fixes for MariaDB bootstrap and recovery
* Fix wsrep sequence number detection. Log message format is
  'WSREP: Recovered position: <UUID>:<seqno>' but we were picking out
  the UUID rather than the sequence number. This is as good as random.

* Add become: true to log file reading and removal since
  I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
  'docker cp' command which creates it.

* Don't run handlers during recovery. If the config files change we
  would end up restarting the cluster twice.

* Wait for wsrep recovery container completion (don't detach). This
  avoids a potential race between wsrep recovery and the subsequent
  'stop_container'.

* Finally, we now wait for the bootstrap host to report that it is in
  an OPERATIONAL state. Without this we can see errors where the
  MariaDB cluster is not ready when used by other services.

Change-Id: Iaf7862be1affab390f811fc485fd0eb6879fd583
Closes-Bug: #1834467
2019-07-05 09:20:34 +00:00
Mark Goddard
d93c604d7a Remove shutdown of MariaDB
Since we are now in the Train cycle, we can be sure that any running
MariaDB containers can be safely stopped, and we do not need to perform
an explicit shutdown prior to restarting them.

Change-Id: I5450690f1cbe0c995e8e4b01a76e90dac2574d61
Related-Bug: #1820325
2019-04-08 12:25:27 +01:00
Mark Goddard
b25c0ee477 Fix MariaDB 10.3 upgrade
Upgrading MariaDB from Rocky to Stein currently fails, with the new
container left continually restarting. The problem is that the Rocky
container does not shutdown cleanly, leaving behind state that the new
container cannot recover. The container does not shutdown cleanly
because we run dumb-init with a --single-child argument, causing it to
forward signals to only the process executed by dumb-init. In our case
this is mysqld_safe, which ignores various signals, including SIGTERM.
After a (default 10 second) timeout, Docker then kills the container.

A Kolla change [1] removes the --single-child argument from dumb-init
for the MariaDB container, however we still need to support upgrading
from Rocky images that don't have this change. To do that, we add new
handlers to execute 'mysqladmin shutdown' to cleanly shutdown the
service.

A second issue with the current upgrade approach is that we don't
execute mysql_upgrade after starting the new service. This can leave the
database state using the format of the previous release. This patch also
adds handlers to execute mysql_upgrade.

[1] https://review.openstack.org/644244

Depends-On: https://review.openstack.org/644244
Depends-On: https://review.openstack.org/645990
Change-Id: I08a655a359ff9cfa79043f2166dca59199c7d67f
Closes-Bug: #1820325
2019-03-23 10:21:37 +00:00
caoyuan
471985dc2c Update usage of "|" to "is"
With the more recent versions of ansible, we should now use
"is" instead of the "|"

This should update it.

Change-Id: I6fba56fca182349972e8b0ee5452b37aa4090e0c
2018-08-13 12:40:10 +05:30
Lakshmi Prasanna Goutham Pratapa
14bf524756 Apply Resource Constraints to Services.
This commit is to apply resource-constraints to a few more OpenStack services.
Commit to  apply constraints to the last set of services will be made in
the upcoming commit.

Depends-on: Icafa54baca24d2de64238222a5677b9d8b90e2aa
Change-Id: I39004f54281f97d53dfa4b1dbcf248650ad6f186
2018-07-26 11:35:28 +00:00
Ha Manh Dong
30be04ea91 Specify 'become' for all tasks that use kolla_docker module
Add become to all tasks that use the module "kolla_docker"

Change-Id: I4309c4011687b88ec31d739fd8f834fe2326ff10
Partial-Implements: blueprint ansible-specific-task-become
2018-06-08 12:39:24 +00:00
Jeffrey Zhang
c567055176 Fix ansible warning
- rename action and serial to kolla_ansible and kolla_serial
- use become instead of "sudo <command>" in shell
- Remove quota for failed_when and changed_when in rabbitmq tasks

Change-Id: I78cb60168aaa40bb6439198283546b7faf33917c
Implements: blueprint migrate-to-ansible-2-2-0
2018-05-11 02:54:02 +00:00
Mark Goddard
de56340f86 Fix kolla-ansible genconfig for mariadb
For the genconfig command, master_host will not be defined as it is
defined dynamically in bootstrap.yml.

Co-Authored-By: Stig Telfer <stig@stackhpc.com>
Change-Id: Ib988c8e2de475e9b973fed2f7f752cb2500953c3
Closes-Bug: #1707856
2017-09-25 10:53:36 +01:00
caoyuan
5cd55bf236 Optimize reconfiguration for mariadb
Change-Id: I278609f9832955849bc9381120a1b260f5a03f1b
Partially-implements: blueprint better-reconfigure
2017-07-22 08:50:08 +08:00