Most roles do not use the available Jinja filters.
According to [1], filtering the list of services makes execution
faster than skipping the tasks.
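
As a rough illustration of the idea (not the actual kolla-ansible filter),
the Python sketch below filters a service dictionary before iterating, so
services that are disabled or not mapped to the host never produce a loop
iteration. The example data, the `enabled` and `group` keys, and the
function name `select_enabled_services` are assumptions made for this
example only.

```python
def select_enabled_services(services, host_groups):
    """Return only the services that are enabled and mapped to this host.

    Hypothetical helper: the `enabled` and `group` keys are assumed here.
    """
    return {
        name: svc
        for name, svc in services.items()
        if svc.get("enabled") and svc.get("group") in host_groups
    }


# Hypothetical example data, not taken from any real role.
services = {
    "neutron-server": {"enabled": True, "group": "neutron-server"},
    "neutron-dhcp-agent": {"enabled": False, "group": "neutron-dhcp-agent"},
    "neutron-l3-agent": {"enabled": True, "group": "neutron-l3-agent"},
}
host_groups = {"neutron-server"}

# Filter first, then loop: nothing is iterated over just to be skipped.
selected = select_enabled_services(services, host_groups)
for name in selected:
    print(f"would configure {name}")
```

In a task this corresponds to looping over an already filtered list rather
than attaching a `when:` condition to every item.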
This patchset also includes some cosmetic changes to genconfig.
Individual services now also use a Jinja filter. This has
no impact on performance; it just makes the tasks look cleaner.
Some vars in genconfig were renamed to "service" to make
the tasks more uniform, as some previously used
the service name and some used "service".
Three metrics were taken from the deployment:
- overall deployment time [s]
- time spent on the specific role [s]
- CPU usage (measured with perf) [-]
Overall genconfig time went down on average from 209s to 195s.
Time spent on the loadbalancer role went down on average from 27s to 23s.
Time spent on the neutron role went down on average from 102s to 95s.
Time spent on the nova-cell role went down on average from 54s to 52s.
The average number of CPUs utilized, as reported by perf, also went down
from 3.31 to 3.15.
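
To put the averages above into relative terms, the short Python sketch
below computes the percentage reduction for each measurement. Only the
before and after values come from this commit message; the function name
`relative_reduction` and the dictionary labels are assumptions made for
illustration.

```python
def relative_reduction(before, after):
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100.0


# (before, after) averages as quoted in this commit message.
measurements = {
    "genconfig [s]": (209, 195),
    "loadbalancer role [s]": (27, 23),
    "neutron role [s]": (102, 95),
    "nova-cell role [s]": (54, 52),
    "CPUs utilized [-]": (3.31, 3.15),
}

reductions = {
    name: relative_reduction(before, after)
    for name, (before, after) in measurements.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower")
```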
For details of how this was measured see the comments in gerrit.
[1] - https://github.com/stackhpc/ansible-scaling/blob/master/doc/skip.md
Change-Id: Ib0f00aadb6c7022de6e8b455ac4b9b8cd6be5b1b
Signed-off-by: Roman Krček <roman.krcek@tietoevry.com>
If RabbitMQ is not on the same host as the nova-controller,
this task fails. This change ensures that the
task references an actual RabbitMQ host rather than the host the
task runs on.
Closes-Bug: 2020805
Change-Id: I1b58f4aeda8c9fe8db1770c63c17bf1c465f3d2a
If the container image used by Mariabackup is different from the
one used by the MariaDB server, Mariabackup and MariaDB may be
incompatible. This may cause backup operations to fail.
This change queries the running MariaDB server container's image and
uses it when taking a backup. If MariaDB server isn't running on the
host it falls back to the image defined in configuration.
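
The selection logic described above can be sketched in Python as follows.
This is only an illustration under stated assumptions: the function name
`pick_backup_image`, its parameters, and the example image names are
hypothetical, and the real change is implemented in Ansible tasks rather
than Python.

```python
def pick_backup_image(running_server_image, configured_image):
    """Choose the image to run the backup with.

    running_server_image: image of the running MariaDB server container,
        or None/empty if no such container runs on this host (assumed input).
    configured_image: image defined in configuration, used as the fallback.
    """
    if running_server_image:
        return running_server_image
    return configured_image


# Hypothetical image names, for illustration only.
configured = "registry.example/mariadb-server:configured"
running = "registry.example/mariadb-server:running"

print(pick_backup_image(running, configured))
print(pick_backup_image(None, configured))
```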
The separate mariabackup_image, mariabackup_tag and
mariabackup_image_full variables are no longer required and have been
removed.
Closes-Bug: #2058644
Change-Id: I45f3f90ec1973dae92131ea16a7b248ab7a8ae69
Update the Skyline stop task to use the
service-stop role to simplify the task
and make sure it is using kolla_container.
Authored-By: Roman Krček <roman.krcek@tietoevry.com>
Change-Id: I7b11359cee931273a058364160b64fe1fb606b5e
This fixes the exit codes; see the bug for details.
Open vSwitch treats TERM as fatal (it exits
with 143), and the only way to exit gracefully is to send
the exit command with ovs-appctl, just as the ovs-ctl utility
does.
Depends-On: https://review.opendev.org/c/openstack/kolla/+/905189
Partial-Bug: #2048130
Change-Id: I523018cb98944de60d7b95404deb7cebd641f33a
After the Neutron policy changes, Octavia jobs started
to fail on cascade LB deletion because the Neutron user
did not have the service role.
Closes-Bug: #2065337
Change-Id: I616bf3a3dbb4d963665b1621a9e5e9d417b13942
Sometimes cluster recovery didn't work
because we only looked for the sequence number in the last 200 lines
of the log file.
Fix this by ingesting the complete file and registering only the last
sequence number we find.
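
A minimal Python sketch of the approach: scan the whole log and keep only
the last sequence number seen. It assumes the log contains lines of the
form "WSREP: Recovered position: <uuid>:<seqno>"; the regular expression,
the function name `last_seqno`, and the sample log are illustrative and
are not the code changed here.

```python
import re

# Assumed line format: "... WSREP: Recovered position: <uuid>:<seqno>"
RECOVERED = re.compile(r"Recovered position:\s*[0-9a-fA-F-]+:(-?\d+)")


def last_seqno(log_text):
    """Return the last recovered sequence number in the log, or None."""
    seqno = None
    for line in log_text.splitlines():
        match = RECOVERED.search(line)
        if match:
            # Later matches overwrite earlier ones, so the last one wins.
            seqno = int(match.group(1))
    return seqno


# Hypothetical log: the relevant line is followed by more than 200 others,
# so reading only the tail of the file would miss it.
sample_log = "\n".join(
    [
        "[Note] WSREP: Recovered position: 6f1c2a3e-4b5d-11e9-8f7a-0242ac110002:17",
        "[Note] unrelated line",
        "[Note] WSREP: Recovered position: 6f1c2a3e-4b5d-11e9-8f7a-0242ac110002:42",
    ]
    + ["[Note] filler line"] * 300
)

print(last_seqno(sample_log))
```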
Closes-Bug: 1821173
Change-Id: Iea2661c9d5d262cf99edd5f5b567f252607a0003
Signed-off-by: Sven Kieske <kieske@osism.tech>