These affected deploy (and reconfigure) as well as upgrade,
resulting in WSREP issues, failed deploys, or the need to
recover the cluster.
This patch makes sure k-a does not abruptly terminate
nodes and break the cluster.
This is achieved by cleaner separation between stages
(bootstrap, restart current, deploy new) and 3 phases
for restarts (to keep the quorum).
Upgrade actions, which operate on a healthy cluster,
moved to their own section.
Service restart was refactored.
We no longer rely on the master/slave distinction as
all nodes are masters in Galera.
Closes-bug: #1857908
Closes-bug: #1859145
Change-Id: I83600c69141714fc412df0976f49019a857655f5
In CentOS/RHEL 8 there is no scsi-target-utils package, nor is it
available in EPEL. It is removed from kolla in [1]. In RHEL 7 and beyond
the LIO kernel subsystem can be used instead of the tgtd daemon.
This change removes support for the SCSI target daemon on CentOS/RHEL 8.
The 'tgtd' image is no longer available for CentOS/RHEL 8.
[1] https://review.openstack.org/#/c/613815/5
Change-Id: I718fc16cde2dd177b2a1c2f79b932426034897fe
Related: blueprint centos-rhel-8
This is to fix the duplicated words issue like
"Other services that are are out of scope of this".
Change-Id: Ie4882dbb64d6e8774888b97895af20ba3855f0f8
Adds a variable to evaluate the "ENABLE_MONASCA" env for 'kolla/horizon'.
When 'enable_horizon_monasca' is true, 'policy_item' is called for
Monasca.
Change-Id: Ie9ecb8ab5d4e74af9b83a5b00ccced5b630ab1ed
Implements: blueprint monasca-ui
Signed-off-by: Hamed Bahadorzadeh <h.bahadorzadeh@gmail.com>
This change applies the HAProxy tag to the entire play, ensuring HAProxy
configuration is generated for all services when the HAProxy tag is
specified.
Change-Id: I67f57c831a713142d38c6e7b70f814a9ee8e5aae
Closes-Bug: #1855094
Deploying a RabbitMQ cluster on Train with IPv6 reports:
unable to connect to epmd (port 4369) on control-1: address (cannot connect to host/port)
Closes-Bug: #1856725
Change-Id: I36ebb4e196ece8a304269e8c85e39dda72faae50
Signed-off-by: yj.bai <bai.yongjun@99cloud.net>
Currently, external Ceph Cinder configuration requires the user to create
custom Cinder service configuration.
This change alters the if/else statements to template out cinder backends
configuration when cinder_backend_ceph is True.
Change-Id: I143c3b44d2839e56d1dbf28484c0eaae0a753dc9
Ironic provides a feature to allow instance images to be served from a
local HTTP server [1]. This is the same server used for PXE images with
iPXE. This does not work currently because the ironic_ipxe container
does not have access to /var/lib/ironic/images (ironic docker volume),
where the images are cached. Note that to make use of this feature, the
following is required in ironic.conf:
[agent]
image_download_source = http
This change fixes the issue by giving ironic_ipxe container access to
the ironic volume.
[1] https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html#deploy-with-custom-http-servers
Change-Id: I501d02cfd40fbacea32d551c3912640c5661d821
Closes-Bug: #1856194
These are executed on the local host where we run ansible-playbook,
and we have agreed to drop Python 2 support there.
Partially Implements: blueprint drop-py2-support
Change-Id: Id2190c3a22a56f4f048afbf0f7200daa8f41a292
Change Id84e3b6e62e544582d6917047534e846e026798d added support for
custom HAProxy service config using a plain copy of files in services.d.
Use a template action instead of a copy so that we can use variables and
iterate over groups of hosts.
Change-Id: I1f07785932de4e4540422bd18af95241f05a67bf
We generate the keystone cron schedule via a python script on localhost.
Currently this always uses 'python', however this may not be available
on some systems.
This change switches to use the same python interpreter as used by
ansible-playbook.
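As a rough illustration of the idea (not the code in this change), a Python
process can reuse its own interpreter through sys.executable instead of
relying on a 'python' binary being on PATH. Ansible exposes the equivalent
path as the ansible_playbook_python magic variable; whether this change uses
that variable is an assumption. The helper name below is hypothetical.

```python
import subprocess
import sys


def run_with_current_interpreter(code: str) -> str:
    """Run a Python snippet with the interpreter executing this process.

    Hypothetical helper: it avoids hard-coding 'python', which may be
    missing on systems that only ship 'python3'.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```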
Partially-Implements: blueprint python-3
Change-Id: I6007f8d6880f418a503766cec21a330c44e5b80f
This allows users to supply an Elasticsearch Curator actions file
to manage log retention [1]. Curator then runs on a cron job, which
defaults to every day. A default curator actions file is provided,
which can be customised by the end user if required.
[1] https://www.elastic.co/guide/en/elasticsearch/client/curator/current/actionfile.html
Change-Id: Ide9baea9190ae849e61b9d8b6cff3305bdcdd534
WSGI log files use a different input configuration than OpenStack log
files. Currently this depends on log files matching either *-access.log
or *-error.log. Some services use *_access.log or *_error.log, so are
not parsed correctly.
This change modifies the fluentd configuration to accept an underscore
or hyphen for WSGI log file names.
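The matching rule can be sketched in Python as follows. This is only an
illustration of accepting either separator; the pattern and function name
are assumptions, not the fluentd configuration from this change.

```python
import re

# Hypothetical pattern: allow "-" or "_" before "access" or "error".
WSGI_LOG = re.compile(r".*[-_](access|error)\.log$")


def is_wsgi_log(name: str) -> bool:
    """Return True if the file name looks like a WSGI access or error log."""
    return WSGI_LOG.match(name) is not None
```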
Change-Id: I566d6cac0b6749054fd5422ec8f36f99dacb1db7
Closes-Bug: #1720371
Enable the reconnect_on_error option so that the ES plugin establishes
a new session to the ES cluster on errors. Also, enable buffering
to a file, so that the buffer survives container restarts.
Co-Authored-By: Michal Nasiadka <mnasiadka@gmail.com>
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Co-Authored-By: Doug Szumski <doug@stackhpc.com>
Closes-Bug: #1830724
Change-Id: Ia40685b9d4fc02194e03c8791ddeb3d29d7f07f6
To fix instability and availability issues:
etcd3 is not available in repos for binary kolla images.
etcd3 does not support eventlet-based services [1].
[1] https://review.opendev.org/466098
Change-Id: I430bab735da204fc81696130b17931a89214c876
Closes-bug: #1852086
Closes-bug: #1854932
Currently we don't put global Apache error logs into /var/log/kolla.
This change adds statements that redirect those logs there.
The log file names are adapted so that they are matched by the OpenStack
WSGI logging fluentd input config and the existing logrotate cron entries.
Change-Id: I21216e688a1993239e3e81411a4e8b6f13e138c2
In a deployment where Prometheus is enabled and
Alertmanager is disabled the task "Copying over
prometheus config file" in
'ansible/roles/prometheus/tasks/config.yml' will
fail to template the Prometheus configuration file
'ansible/roles/prometheus/templates/prometheus.yml.j2'
as the variable 'prometheus_alert_rules' does not
contain the key 'files'. This commit fixes this bug.
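The failure mode can be illustrated with a small Python sketch. The shape of
'prometheus_alert_rules' (a dict that only carries 'files' when the lookup
actually ran) and the function name are assumptions; the actual fix lives in
the Ansible role and may be implemented differently.

```python
def alert_rule_files(prometheus_alert_rules: dict) -> list:
    """Return the rule files, or an empty list when 'files' is absent.

    Hypothetical helper: indexing with ['files'] would raise KeyError
    when the key is missing, so a default is used instead.
    """
    return prometheus_alert_rules.get("files", [])
```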
Change-Id: Idbe1e52dd3693a6f168d475f9230a253dae64480
Closes-Bug: #1854540
Adds support for configuration of the Docker client timeout via
'docker_client_timeout'.
This change also increases the default timeout to 120 seconds, as we
sometimes see timeouts in CI and heavily loaded or underpowered
environments. Increasing 'docker_client_timeout' further may be helpful
in cases where Docker reports 'Read timed out'.
Change-Id: I73745771078cb2c0ebae2b1d87ba2c4c12958d82
Closes-Bug: #1809844