Following ideas here:
https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
Make sure old messages with no consumer are dropped after a message
TTL of 10 mins, which is longer than the 1 min RPC timeout.
Also ensure queues expire after an hour of inactivity, so queues from
removed nodes or renamed nodes don't grow over time.
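
As a rough sketch of the numbers above: RabbitMQ expresses TTL and expiry
values in milliseconds. The helper and constant names below are hypothetical
and are not taken from the change itself.

```python
# Hypothetical helper names, for illustration only.
def minutes_to_ms(minutes: int) -> int:
    """Convert minutes to milliseconds."""
    return minutes * 60 * 1000


RPC_TIMEOUT_MS = minutes_to_ms(1)    # 1 min RPC timeout
MESSAGE_TTL_MS = minutes_to_ms(10)   # drop unconsumed messages after 10 mins
QUEUE_EXPIRY_MS = minutes_to_ms(60)  # expire queues after an hour of inactivity


def check_message_ttl(rpc_timeout_ms: int, message_ttl_ms: int) -> None:
    """Raise if a message could be dropped before its RPC call times out."""
    if message_ttl_ms <= rpc_timeout_ms:
        raise ValueError("message TTL must be longer than the RPC timeout")


check_message_ttl(RPC_TIMEOUT_MS, MESSAGE_TTL_MS)
print(MESSAGE_TTL_MS, QUEUE_EXPIRY_MS)
```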
Change-Id: Ifb28ac68b6328adb604a7474d01e5f7a47b2e788
Adds two new flags to alter behaviour in RabbitMQ:
* `rabbitmq_message_ttl_ms`, which lets you set a TTL on messages.
* `rabbitmq_queue_expiry_ms`, which lets you set an expiry time on queues.
See https://www.rabbitmq.com/ttl.html for more information on both.
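
A minimal sketch of how the two flags might map onto a RabbitMQ policy
definition. The `message-ttl` and `expires` keys are the ones documented at
the URL above; the function name, the plain dict of variables, and the rule
that an unset flag adds no key are assumptions made for illustration.

```python
# Hypothetical mapping from flag name to RabbitMQ policy key.
FLAG_TO_POLICY_KEY = {
    "rabbitmq_message_ttl_ms": "message-ttl",
    "rabbitmq_queue_expiry_ms": "expires",
}


def ttl_policy_from_flags(variables: dict) -> dict:
    """Build policy keys from whichever flags are set (assumed behaviour)."""
    definition = {}
    for flag, policy_key in FLAG_TO_POLICY_KEY.items():
        value = variables.get(flag)
        if value is not None:
            definition[policy_key] = int(value)
    return definition


print(ttl_policy_from_flags({"rabbitmq_message_ttl_ms": 600000}))
```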
Change-Id: I51ca37ffbb1bb5c07f2d39873f0f33ca20263f2a
deployments
This allows services to work with etcd when coordination is enabled
for TLS internal deployments. Without this fix, we fail to connect to
etcd with the coordination backend and the service itself crashes.
Change-Id: I0c1d6b87e663e48c15a846a2774b0a4531a3ca68
CentOS Storage SIG rpms have a recommended install
section that installs podman - let's stop doing that.
Ceph is also suffering from the enormous open files
ulimit that EL9 defaults to - let's set a default
in docker engine for now.
Change-Id: I41f39f520dfecec307ad3b86e1e0363570198e42
Fourth part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
which was suggested to be split into smaller patches.
This commit refactors select methods from the DockerWorker class
into a ContainerWorker class. The new class contains Docker-independent
methods that are also used by the Podman introduction and is intended
as a parent class for specific worker classes.
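
The parent/child split described above could look roughly like the sketch
below. Only the class names ContainerWorker and DockerWorker come from this
commit; the method names, the `params` dict, and the in-memory stand-in for
the engine are hypothetical and are not the real kolla-ansible code.

```python
from abc import ABC, abstractmethod


class ContainerWorker(ABC):
    """Engine-independent logic shared by specific worker classes."""

    def __init__(self, params: dict):
        self.params = params

    def split_image(self) -> tuple:
        """Split an image reference into (repository, tag); hypothetical helper."""
        image = self.params["image"]
        repo, sep, tag = image.rpartition(":")
        # A "/" after the last ":" means the colon belongs to a registry port.
        if not sep or "/" in tag:
            return image, "latest"
        return repo, tag

    @abstractmethod
    def container_exists(self) -> bool:
        """Engine-specific lookup, implemented by each child class."""


class DockerWorker(ContainerWorker):
    """Stand-in child class; a real one would call the Docker API."""

    def __init__(self, params: dict, known_containers: set):
        super().__init__(params)
        self.known_containers = known_containers  # fake engine state

    def container_exists(self) -> bool:
        return self.params["name"] in self.known_containers


worker = DockerWorker(
    {"name": "nova_api", "image": "registry:5000/kolla/nova-api:2023.1"},
    known_containers={"nova_api"},
)
print(worker.split_image(), worker.container_exists())
```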
Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I2dd5920410dda053f2dfedc4e2666c56b1a7095a
For Masakari and HACluster to work properly, the hostnames used
in HACluster need to match the hostnames used in Nova.
Change-Id: Iac917ef4471905caab591cd64eab379e150a8524
Previously, when running one of the following commands:
kolla-ansible deploy --check
kolla-ansible genconfig --check
deployment or configuration generation failed for various reasons:
MariaDB failed to look up the existing cluster.
Keystone failed to generate cron config.
Nova-cell failed to get the cell settings.
Closes-Bug: #2002661
Change-Id: I5e765f498ae86d213d0a4379ca5d473db1499962
Currently we do not follow the RabbitMQ advice on replicas here:
https://www.rabbitmq.com/ha.html#replication-factor
Here we reduce the number of replicas to n // 2 + 1 as advised
above. The hope is that this helps speed up recovery from RabbitMQ
issues.
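
The n // 2 + 1 rule can be worked through for a few cluster sizes. The
function name is hypothetical; n is taken to be the number of RabbitMQ
nodes in the cluster.

```python
def quorum_replica_count(cluster_size: int) -> int:
    """Return n // 2 + 1, a majority of the nodes in the cluster."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


for n in (1, 2, 3, 5):
    print(n, quorum_replica_count(n))
# prints: 1 1, 2 2, 3 2, 5 3 (one pair per line)
```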
Related-Bug: #1954925
Change-Id: Ib6bcb26c499c9884faa4a0cd51abaec00cacb096
Adds the flag `rabbitmq_ha_replica_count` to change how many different
nodes a queue should be mirrored across. If the value is not set, then
it defaults to "ha-mode":"all". This value is unset by default to avoid
any unexpected changes to the RabbitMQ definitions.json file, as that
would trigger an unexpected restart of RabbitMQ during the next deploy.
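
A sketch of the default described above, under stated assumptions. The
fallback to "ha-mode":"all" is from this commit message. Using
"ha-mode":"exactly" with "ha-params" when a count is set is an assumption
based on RabbitMQ's documented mirroring policy keys, not something this
message confirms. The function name is hypothetical.

```python
from typing import Optional


def ha_policy_definition(rabbitmq_ha_replica_count: Optional[int] = None) -> dict:
    """Build the mirroring part of a policy definition (illustrative only)."""
    if rabbitmq_ha_replica_count is None:
        # Flag unset: keep the existing behaviour so definitions.json is unchanged.
        return {"ha-mode": "all"}
    # Assumed mapping: mirror each queue across exactly this many nodes.
    return {"ha-mode": "exactly", "ha-params": int(rabbitmq_ha_replica_count)}


print(ha_policy_definition())
print(ha_policy_definition(2))
```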
Change-Id: Iee98cd937197a73a3b04aa8501fa325e8ecfff24