Kolla-Ansible populates /etc/hosts with overcloud hosts using their API
interface IP address. When configured correctly, this allows Nova to use
the API interface for live migration of instances between compute hosts.
The hostname used is from the `ansible_hostname` variable, which is a
short hostname generated by Ansible using the first dot as a delimiter.
However, Nova by default uses the result of socket.gethostname() to
register nova-compute services.
In deployments where hostnames are set to FQDNs, for example when using
FreeIPA, nova-compute would try to reach the other compute node using
its FQDN (as registered in the Nova database), which was absent from
/etc/hosts. This can result in failures to live migrate instances if
DNS entries don't match.
This commit populates /etc/hosts with `ansible_nodename` (hostname as
reported by the system) in addition to `ansible_hostname`, if they are
different.
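As a rough illustration of the resulting entries, here is a minimal
sketch. It is not the actual Kolla-Ansible template: the function name
`hosts_entry`, the example addresses, and the order of the names on the
line are assumptions made for illustration.

```python
def hosts_entry(api_ip, nodename):
    """Build one /etc/hosts line for a host (illustrative only).

    `nodename` stands in for ansible_nodename. The short name mimics
    ansible_hostname by cutting at the first dot.
    """
    hostname = nodename.split(".", 1)[0]
    names = [hostname]
    # Add the system-reported name only when it differs from the short one.
    if nodename != hostname:
        names.append(nodename)
    return " ".join([api_ip] + names)


# Host whose system hostname is an FQDN: both names are listed.
print(hosts_entry("192.0.2.11", "compute01.example.com"))
# Host whose system hostname is already short: a single name is listed.
print(hosts_entry("192.0.2.12", "compute02"))
```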
Change-Id: Id058aa1db8d60c979680e6a41f7f3e1c39f98235
Closes-Bug: #1830023
Similar to what we did in https://review.opendev.org/#/c/655276, but
for the ceilometer/data/meters.d/meters.yaml file.
The idea is to give operators a way to manage custom meter YAML files
via Kolla-ansible. To do that, we let operators use a folder, called
"meters.d" by default, in their local ceilometer configuration, from
which all of the custom meter YAML files are read. If this folder
exists and has YAML files in it, we copy them to the default
"/etc/ceilometer/meters.d" path in the containers. We do not inject
anything into the container, though. We copy the files to the control
node, and then we map them via the ceilometer*.json container
configuration files.
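The discovery step described above could be sketched as follows. This
is an illustration only, not the Ansible tasks added by this change:
the function name `find_custom_meters`, the "*.yaml" suffix filter, and
the example configuration path are assumptions.

```python
from pathlib import Path


def find_custom_meters(ceilometer_config_dir, folder_name="meters.d"):
    """Return the custom meter YAML files found in the operator's folder.

    An empty list means there is nothing to copy, either because the
    folder is missing or because it holds no YAML files.
    """
    meters_dir = Path(ceilometer_config_dir) / folder_name
    if not meters_dir.is_dir():
        return []
    return sorted(
        p for p in meters_dir.iterdir()
        if p.is_file() and p.suffix == ".yaml"
    )


# Hypothetical operator configuration directory; prints [] if it is absent.
print(find_custom_meters("/etc/kolla/config/ceilometer"))
```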
Change-Id: I712edcf39bfdb64887e25437f0aff30a45a829dd
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
The etc_examples and inventory should be copied from the virtual
environment rather than the system.
Change-Id: I3ac1e057971b7481a0bce2a15351031e51bf97d6
Closes-Bug: #1829435
backport: stein, rocky
During startup of nova-compute, we see the following error message:
Error gathering result from cell 00000000-0000-0000-0000-000000000000:
DBNotAllowed: nova-compute
This issue was observed in devstack [1], and fixed [2] by removing
database configuration from the compute service.
This change takes the same approach, removing DB config from nova.conf
in the nova-compute* containers.
[1] https://bugs.launchpad.net/devstack/+bug/1812398
[2] 8253787137
Change-Id: I18c99ff4213ce456868e64eab63a4257910b9b8e
Closes-Bug: #1829705
Add the ability for Kolla-ansible to manage the 'max_workers' parameter
in Cloudkitty. We will use the 'openstack_service_workers' variable
to control the number of workers that Cloudkitty is able to use.
Change-Id: I2f4e7e5c45d71a7e01d1b743d2eb4850cc339419
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
Right now every controller rotates fernet keys. This is nice because
should any controller die, we know the remaining ones will rotate the
keys. However, we are currently over-rotating the keys.
When we over-rotate keys, we get logs like this:
This is not a recognized Fernet token <token> TokenNotFound
Most clients can recover and get a new token, but some clients (like
Nova passing tokens to other services) can't do that because they don't
have the password to generate a new token.
With three controllers, the crontab in keystone-fernet shows the
once-a-day rotation correctly staggered across the three controllers:
ssh ctrl1 sudo cat /etc/kolla/keystone-fernet/crontab
0 0 * * * /usr/bin/fernet-rotate.sh
ssh ctrl2 sudo cat /etc/kolla/keystone-fernet/crontab
0 8 * * * /usr/bin/fernet-rotate.sh
ssh ctrl3 sudo cat /etc/kolla/keystone-fernet/crontab
0 16 * * * /usr/bin/fernet-rotate.sh
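The staggering shown in these crontabs can be reproduced with a small
calculation. This is a sketch under assumptions: the function name and
the even-spacing rule are hypothetical, chosen to match the three
crontabs quoted above, and this is not the fernet cron job generator
itself.

```python
def staggered_daily_cron(host_index, num_hosts,
                         command="/usr/bin/fernet-rotate.sh"):
    """Return a once-a-day crontab line, offset evenly per host."""
    minutes_per_day = 24 * 60
    # Spread the hosts evenly across the day (assumed spacing rule).
    offset = (minutes_per_day // num_hosts) * host_index
    hour, minute = divmod(offset, 60)
    return f"{minute} {hour} * * * {command}"


# Three controllers: one line per host, eight hours apart.
for index in range(3):
    print(staggered_daily_cron(index, 3))
```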
Currently with three controllers we have this keystone config:
[token]
expiration = 86400 (although, keystone default is one hour)
allow_expired_window = 172800 (this is the keystone default)
[fernet_tokens]
max_active_keys = 4
Currently, kolla-ansible configures key rotation according to the following:
rotation_interval = token_expiration / num_hosts
This means we rotate keys more quickly the more hosts we have, which doesn't
make much sense.
Keystone docs state:
max_active_keys =
((token_expiration + allow_expired_window) / rotation_interval) + 2
For details see:
https://docs.openstack.org/keystone/stein/admin/fernet-token-faq.html
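Plugging the values quoted in this message into that formula shows the
mismatch. This is a minimal sketch: the function name is illustrative,
and rounding the division up is an assumption made here, since the
formula as quoted does not say how to round.

```python
import math


def max_active_keys(token_expiration, allow_expired_window,
                    rotation_interval):
    """Apply the Keystone formula; all arguments are in seconds."""
    periods = (token_expiration + allow_expired_window) / rotation_interval
    # Rounding up is an assumption, not part of the quoted formula.
    return math.ceil(periods) + 2


token_expiration = 86400
allow_expired_window = 172800

# Existing behaviour: rotation_interval = token_expiration / num_hosts.
rotation_interval = token_expiration // 3
print(max_active_keys(token_expiration, allow_expired_window,
                      rotation_interval))

# An interval as long as expiration plus the expired window needs far fewer.
print(max_active_keys(token_expiration, allow_expired_window,
                      token_expiration + allow_expired_window))
```

With three controllers the formula asks for 11 active keys, against the
4 currently configured. An interval equal to token_expiration plus
allow_expired_window brings the requirement down to 3.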
Rotation is based on pushing out a staging key, so should any server
start using that key, other servers will consider that valid. Then each
server in turn starts using the staging key, each in turn demoting the
existing primary key to a secondary key. Eventually you prune the
secondary keys when there is no token in the wild that would need to be
decrypted using that key. So this all makes sense.
This change adds new variables for fernet_token_allow_expired_window and
fernet_key_rotation_interval, so that we can calculate the correct
number of active keys. We now set the default rotation interval so as
to minimise the number of active keys to 3 - one primary, one
secondary, one buffer.
This change also fixes the fernet cron job generator, which was broken
in the following cases:
* requesting an interval of more than 1 day resulted in no jobs
* requesting an interval of more than 60 minutes, unless an exact
multiple of 60 minutes, resulted in no jobs
It should now be possible to request any interval up to a week divided
by the number of hosts.
Change-Id: I10c82dc5f83653beb60ddb86d558c5602153341a
Closes-Bug: #1809469
Before making changes to this script, document its behaviour with a unit
test.
There are two major issues:
* requesting an interval of more than 1 day results in no jobs
* requesting an interval of more than 60 minutes, unless an exact
multiple of 60 minutes, results in no jobs
Change-Id: I655da1102dfb4ca12437b7db0b79c9a61568f79e
Related-Bug: #1809469
When integrating a third-party component into OpenStack with
kolla-ansible, it may be necessary to mount extra volumes into
containers.
Change-Id: I69108209320edad4c4ffa37dabadff62d7340939
Implements: blueprint support-extra-volumes
Cloudkitty has a default metrics.yml file, built into the container at
/etc/cloudkitty/metrics.yml. We would like to be able to
overwrite/customize these metrics configurations via kolla-ansible.
Cloudkitty is able to use a custom metrics file via "metrics_conf".
Therefore, we are enabling this configuration via Kolla-ansible.
Change-Id: Id9019298482c040be05f540e71dacfdf0bd77469
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
When using the default domain name there are issues authenticating
with Keystone. For example, you can only log in on the second attempt
and the Monasca datasource fails to authenticate. Switching to the
default domain id resolves these issues.
Change-Id: I2cb4b2608c74dd853c97e4fc27078930bc72fdf8
The flush_handlers clause doesn't honour conditional clauses.
Instead, it prints a warning and runs anyway:
[WARNING]: flush_handlers task does not support when conditional
See: https://github.com/ansible/ansible/pull/41126
TrivialFix
Change-Id: Iaf70c2e932ae6dfb723bdb2ba658acdbfe74ebe2
This fixes a deprecation warning that gets displayed when running
the kibana/post_config 'Get kibana default indexes' task.
HEADERS_ has been deprecated since ansible 2.1 and will be
removed in 2.9.
https://docs.ansible.com/ansible/latest/modules/uri_module.html
TrivialFix
Change-Id: I177113c606119505c6cb69c66a326f7cbdaf2196
Since Ansible 2.5, the use of jinja tests as filters has been
deprecated.
I've run the script provided by the ansible team to 'fix' the
jinja filters to conform to the newer syntax.
This fixes the deprecation warnings.
Change-Id: I844ecb7bec94e561afb09580f58b1bf83a6d00bd
Closes-bug: #1827370
'Bootstrapping' was spelt with one p - added the second p so the
word becomes a verb nicely.
TrivialFix
Change-Id: I126a5c253408af70d6d0a3be6e59270f385a00e3
The glance_upgrading variable is defined only during rolling upgrades,
but its value is checked on every call to the "Restart glance-api
container" handler.
Closes-Bug: #1826511
Change-Id: Idc95306c50a09666ac0a12e975307a42aef9e352
By default, Ceilometer uses gnocchi_resources.yaml as the cfg_file that
defines the metric archive policy and the metrics sent to Gnocchi. Users
may want to define their own strategy.
Change-Id: I49ba34588101ac2b4f450067c8c9a354134063bb
Signed-off-by: Ning Yao <yaoning@unitedstack.com>
20 seconds may be too short a wait for Grafana to become ready. Let the
check task wait for up to 60 seconds.
backport: rocky
Change-Id: Ib219ad215d1ef2147ba3591f8c398feb4f3c8888
Closes-Bug: #1821285
Add the possibility to mount sources as volumes into containers in a
"more than documentation" way. That will let us use kolla as a
replacement for devstack.
Partially implements: blueprint mount-sources
Change-Id: I4868ed6829bd037e1012d1f40c4a1d1b9995bf95
This change ensures that URLs returned from these services reference
the HAProxy endpoint, rather than the host on which the service is
running.
Closes-Bug: #1825150
Change-Id: I7f966ff749ea37620f1bde7019a598cb9505fa45
Periodic jobs don't have zuul.change defined, since there is no change
being tested. This causes an early failure when referencing zuul.change
to set the image tag for built images. In periodic jobs we'll never need
to build images because there is no dependent kolla change under test.
Change-Id: I6d9d81cf17b7d0d7aaf87cd96418c904c46681f2