deployments
This allows services to work with etcd when coordination is enabled
for TLS internal deployments. Without this fix, we fail to connect to
etcd with the coordination backend and the service itself crashes.
Change-Id: I0c1d6b87e663e48c15a846a2774b0a4531a3ca68
For Masakari and HACluster to work properly, the hostnames used
in HACluster need to match the hostnames used in Nova.
Change-Id: Iac917ef4471905caab591cd64eab379e150a8524
Previously, when running one of the following commands:
kolla-ansible deploy --check
kolla-ansible genconfig --check
deployment or configuration generation failed for various reasons:
MariaDB failed to look up the existing cluster,
Keystone failed to generate cron config, and
Nova-cell failed to get the cell settings.
Closes-Bug: #2002661
Change-Id: I5e765f498ae86d213d0a4379ca5d473db1499962
Currently we do not follow the RabbitMQ advice on replicas here:
https://www.rabbitmq.com/ha.html#replication-factor
Here we reduce the number of replicas to n // 2 + 1 as advised
above. The hope is that this helps speed up recovery from RabbitMQ
issues.
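For example, with 3 RabbitMQ nodes this gives 3 // 2 + 1 = 2 mirrors
per queue, and with 5 nodes it gives 5 // 2 + 1 = 3.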
Related-Bug: #1954925
Change-Id: Ib6bcb26c499c9884faa4a0cd51abaec00cacb096
Adds the flag `rabbitmq_ha_replica_count` to change how many different
nodes a queue should be mirrored across. If the value is not set, then
it defaults to "ha-mode":"all". This value is unset by default to avoid
any unexpected changes to the RabbitMQ definitions.json file, as that
would trigger an unexpected restart of RabbitMQ during the next deploy.
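A minimal globals.yml sketch of opting in (the value is illustrative,
here assuming a three-node RabbitMQ cluster):
    # /etc/kolla/globals.yml
    # Mirror each queue across two of the three RabbitMQ nodes.
    rabbitmq_ha_replica_count: 2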
Change-Id: Iee98cd937197a73a3b04aa8501fa325e8ecfff24
etcd-compatible tooz drivers do not support multiple endpoints via
backend_url. We can put a load balancer in front of etcd and configure
backend_url to use the VIP instead. The issue with hard-coding the first
host is that we break coordination if we take this host offline. In the
case of cinder, we would not be able to perform any volume-related
operations.
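As an illustration, cinder's [coordination] backend_url would then
point at the VIP, e.g. something of the form
etcd3+http://<internal VIP>:2379 (the exact scheme and port depend on
the tooz driver and TLS settings), rather than at the first etcd host.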
Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Change-Id: Ib684501ba03c386dc5ac71e5cbea05c99f191665
By default ha-promote-on-shutdown=when-synced. However we are seeing
issues with RabbitMQ automatically recovering when nodes are restarted.
https://www.rabbitmq.com/ha.html#cluster-shutdown
Rather than waiting for operator intervention, it is better to allow
recovery to happen, even if that means we may lose some messages.
A few failed and timed-out operations are better than a totally broken
cloud. This is achieved using ha-promote-on-shutdown=always.
Note, when a node failure is detected, this is already the default
behaviour from 3.7.5 onwards:
https://www.rabbitmq.com/ha.html#promoting-unsynchronised-mirrors
This patch adds the option to change the ha-promote-on-shutdown
definition, using the flag `rabbitmq_ha_promote_on_shutdown`. This
value is unset by default to avoid any unexpected changes to the
RabbitMQ definitions.json file, as that would trigger an unexpected
restart of RabbitMQ during the next deploy.
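For example, to opt in via globals.yml (a minimal sketch):
    # /etc/kolla/globals.yml
    # Promote an unsynchronised mirror on controlled shutdown as well,
    # favouring availability over possible message loss.
    rabbitmq_ha_promote_on_shutdown: "always"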
Related-Bug: #1954925
Change-Id: I2146bda2c72ddac2c9923c6941b0596395fd9ab5
This patch adds connection: local for the above-mentioned task, as
kolla-ansible can be executed in a Docker container, as in my case.
Without connection: local, Ansible tries to connect to localhost via
SSH, where the specified Python script is not available. With
connection: local, everything works as expected, as the file is found
inside the container.
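A minimal sketch of the idea (the task name and script path below are
illustrative only, not the actual task touched by this patch):
    # connection: local makes Ansible run the script where kolla-ansible
    # itself runs (e.g. inside a container) instead of connecting to
    # localhost over SSH.
    - name: Run a helper Python script on the local connection
      connection: local
      script: example_helper.py  # hypothetical script name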
Closes-Bug: #2004224
Change-Id: I219a958b4f101efb71a2935e6d910dae5c65f0be
neutron_tls_proxy and glance_tls_proxy use the haproxy container
image. Pin them to haproxy_tag directly.
Change-Id: I73142db48ebe6641520d21b560f16de892e07c34
This change serialises the neutron l3 agent restart process and adds a
user-configurable delay between restarts. This can prevent connectivity
loss due to all agents being restarted at the same time.
Routers increase the recovery time, making this issue more prevalent.
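A rough sketch of the approach (the play structure and the delay
variable name are illustrative, not the exact implementation):
    # Restart the L3 agents one host at a time, pausing between hosts.
    - hosts: neutron-l3-agent
      serial: 1
      tasks:
        - name: Restart the neutron_l3_agent container
          become: true
          command: docker restart neutron_l3_agent
        - name: Wait before restarting the agent on the next host
          pause:
            seconds: "{{ neutron_l3_agent_restart_delay | default(30) }}"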
Change-Id: I3be0ebfa12965e6ae32d1b5f13f8fd23c3f52b8c
In order to honour the configured maximum number of attempts,
it has to be present in nova.conf inside the
nova_conductor container, otherwise the default value
of 3 will be used.
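The option in question appears to be Nova's [scheduler] max_attempts,
which defaults to 3 when it is not rendered into the conductor's
nova.conf.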
Closes-Bug: #2003587
Change-Id: I928af332b8658223444594f96417830233057284
This commit adds the SystemdWorker class to the kolla_docker Ansible
module. It is used to manage container state via systemd calls.
Change-Id: I20e65a6771ebeee462a3aaaabaa5f0596bdd0581
Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Signed-off-by: Martin Hiner <m.hiner@partner.samsung.com>
As RabbitMQ's configuration file is not an INI or YAML file,
there is no option to extend the configuration with new config
options via merge_configs or merge_yaml.
This patch moves the config options to a dictionary
so they can be overridden in /etc/kolla/globals.yml.
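A minimal globals.yml sketch of the idea; the dictionary variable name
and keys below are illustrative, check the rabbitmq role defaults for
the actual variable introduced by this change:
    # /etc/kolla/globals.yml (illustrative variable name and values)
    rabbitmq_extra_config:
      cluster_partition_handling: "pause_minority"
      vm_memory_high_watermark.relative: 0.5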
Change-Id: I5cd772f4fb80a0e200fb24d67be735ca81e3fdeb