Rerunning the overcloud deploy command with no changes restarts a
truckload of containers (first seen via
https://bugzilla.redhat.com/show_bug.cgi?id=1612960). So we really have
three separate issues here. Below is the list of all the containers that
may restart needlessly (at least what I have observed in my tests):
A) cron category:
ceilometer_agent_notification cinder_api cinder_api_cron cinder_scheduler
heat_api heat_api_cfn heat_api_cron heat_engine keystone keystone_cron
logrotate_crond nova_api nova_api_cron nova_conductor nova_consoleauth
nova_metadata nova_scheduler nova_vnc_proxy openstack-cinder-volume-docker-0
panko_api
These end up being restarted because the config volume for each of these
containers contains a cron file, and cron files are generated with a timestamp inside:
$ cat /var/lib/config-data/puppet-generated/keystone/var/spool/cron/keystone
...
# HEADER: This file was autogenerated at 2018-08-07 11:44:57 +0000 by puppet.
...
The timestamp is unfortunately hard coded into puppet in both the cron provider and the parsedfile
provider:
https://github.com/puppetlabs/puppet/blob/master/lib/puppet/provider/cron/crontab.rb#L127
https://github.com/puppetlabs/puppet/blob/master/lib/puppet/provider/parsedfile.rb#L104
We fix this by piping the tar output through 'tar xO' and grepping away any
line that starts with # HEADER before computing the md5.
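As a rough sketch, the checksum pipeline then looks like this (the path is
taken from the keystone example above; the surrounding deploy tooling differs
in detail):
$ tar -cf - /var/lib/config-data/puppet-generated/keystone \
    | tar -xOf - | grep -v '^# HEADER' | md5sum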
B) swift category:
swift_account_auditor swift_account_reaper swift_account_replicator
swift_account_server swift_container_auditor swift_container_replicator
swift_container_server swift_container_updater swift_object_auditor
swift_object_expirer swift_object_replicator swift_object_server
swift_object_updater swift_proxy swift_rsync
So the swift containers restart because, when recalculating the md5 over the
/var/lib/config-data/puppet-generated/swift folder, we also include:
B.1) /etc/swift/backups/..., a folder which over time collects backups of the ring files
B.2) /etc/swift/*.gz, which also appear to change over time
We just add parameters to the tar command to exclude those files, as
we do not need to trigger a restart when they change:
--exclude='*/etc/swift/backups/*' --exclude='*/etc/swift/*.gz'
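For illustration, the excludes slot into the checksum command along these
lines (the md5 invocation is a sketch, not the exact tooling):
$ tar -cf - --exclude='*/etc/swift/backups/*' --exclude='*/etc/swift/*.gz' \
    /var/lib/config-data/puppet-generated/swift | md5sum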
C) libvirt category:
nova_compute nova_libvirt nova_migration_target nova_virtlogd
This one seems to be due to the /etc/libvirt/passwd.db file containing a
timestamp; even when we disable a user and passwd.db does not exist, it gets
created:
[root@compute-1 nova_libvirt]# git diff cb2441bb1caf7572ccfd870561dcc29d7819ba04..0c7441f30926b111603ce4d4b60c6000fe49d290 .
passwd.db changes do not need to trigger a restart of the container, so
we can safely exclude this file from any md5 calculation.
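Sketched the same way as the swift excludes above (the nova_libvirt path
matches the prompt in the diff above; exact flag placement may differ):
$ tar -cf - --exclude='*/etc/libvirt/passwd.db' \
    /var/lib/config-data/puppet-generated/nova_libvirt | md5sum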
Part C) was: Co-Authored-By: Martin Schupper <mschuppe@redhat.com>
We only partial-bug this one because we want a cleaner fix where
exceptions to the files being checksummed will be specified in the tht
service files.
Partial-Bug: #1786065
Tested as follows:
./overcloud_deploy.sh
tripleo-ansible-inventory --static-yaml-inventory inv.yaml
ansible -f1 -i inv.yaml -m shell --become -a "docker ps --format=\"{{ '{{' }}.Names{{ '}}' }}: {{ '{{' }}.CreatedAt{{ '}}' }}\" | sort" overcloud > before
./overcloud_deploy.sh
ansible -f1 -i inv.yaml -m shell --become -a "docker ps --format=\"{{ '{{' }}.Names{{ '}}' }}: {{ '{{' }}.CreatedAt{{ '}}' }}\" | sort" overcloud > after
diff -u before after | wc -l
0
Change-Id: I10f5cacd9fee94d804ebcdffd0125676f5a209c4
Due to [1] ansible always accesses servers via lowercase names. Also, with
respect to [2], this patch lowercases the name used in fqdn, hostname,
ssh_known_hosts and other places.
[1] aa4278e5f3
[2] https://tools.ietf.org/html/rfc4343
Change-Id: Ib25832496d6504def436414b9c2903cbfe5854d4
Resolves: rhbz#1619556
When deploying with podman, we need to create directories if they don't
exist before trying to mount them later when containers are starting.
Otherwise, podman fails with this kind of error:
error checking path "/etc/iscsi": stat /etc/iscsi: no such file or directory
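A minimal illustration of the fix on the host side (the image name is a
placeholder):
$ mkdir -p /etc/iscsi    # make sure the bind-mount source exists
$ podman run -v /etc/iscsi:/etc/iscsi <image> ...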
Change-Id: I7dbdc7f3646dda99c8014b4c8ca2edd48778b392
This patch passes the RpcPort parameter value to the container health check
scripts, which work by verifying that the service is connected to RabbitMQ.
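As a hedged sketch, such a check boils down to something like the following,
with 5672 standing in for the RpcPort value (the actual health check scripts
differ in detail):
RPC_PORT="${RPC_PORT:-5672}"   # fed from the RpcPort parameter
# healthy only if an established connection to the RabbitMQ port exists
ss -nt "( dport = :${RPC_PORT} )" | grep -q ESTAB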
Change-Id: If63f136b5173bb9a94572ea5062a188469c2c782
Closes-Bug: #1782369
ml2_conf.ini shouldn't be used by neutron-ovs-agent.
Some parameters can conflict and overwrite
each other, e.g. firewall_driver. Using openvswitch_agent.ini
is enough to configure the agent correctly.
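For illustration, the kind of clash this avoids (values are examples;
firewall_driver lives in the [securitygroup] section of both files):
# ml2_conf.ini
[securitygroup]
firewall_driver = iptables_hybrid
# openvswitch_agent.ini
[securitygroup]
firewall_driver = openvswitch
With oslo.config, whichever --config-file is passed last wins, so dropping
ml2_conf.ini from the agent avoids the surprise.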
Change-Id: I815cb67fd4ea9ad98347d6d6bbcc9bcf01113649
Closes-Bug: 1789549
The ssh client no longer appears to accept the regular known_hosts entry when
the target is running on a non-default port.
Adding '[host]:*' should fix this regardless of the port.
However, this form does not work for the default port, so we must include both.
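For example, the resulting ssh_known_hosts then carries both forms (host and
key are placeholders):
[192.0.2.10]:* ssh-rsa AAAA...
192.0.2.10 ssh-rsa AAAA...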
Change-Id: I519ff6053676870dff1bdff60fb1f6b2aa5ee8c9
Closes-bug: #1789452
A long time ago this was required because ironicclient defaulted to
an ancient API version. That changed back in Queens (ironicclient 2.0),
so we can drop it now to avoid confusion.
Change-Id: Icea0bdf6d5dcdd81ce9c34be7af8a241da0861bc
Closes-Bug: #1789392
Core/Ram/Disk filters are not required when using filter_scheduler.
After https://review.openstack.org/#/c/565841, with these
filters enabled nova does not schedule to the ironic nodes and the overcloud
deployment fails.
For now this only adjusts the undercloud; it would be good to see which
scheduler/filters are enabled in the overcloud and reflect the change there as well.
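For reference, an enabled_filters list without the dropped filters might look
like this in nova.conf (illustrative; the exact list is set by the templates):
[filter_scheduler]
enabled_filters = RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter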
Related-Bug: #1787910
Depends-On: Ia82f1c6be0d5504498e77a90268cad8abecdeae2
Change-Id: I0e376d99adeaa318118833018be81491c6b14095
The unicode function no longer exists in Python 3, so let's just designate
the string as unicode since we're doing the replacement in bash anyway.
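Illustratively (the token name is hypothetical), the change amounts to:
before (py2 only):      unicode('@REPLACED_IN_BASH@')
after (py2 and py3):    u'@REPLACED_IN_BASH@'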
Change-Id: I3226a3a16eec711097c30929946cb2d36646c4cc
Related-Blueprint: python3-support
The ServiceNetMap contains an incorrect entry for the SnmpdNetwork.
The entry "ctrlplane" should be "ctlplane".
Change-Id: I6c8ab952e364e8fc643e291388b7f13615a1df3e
We don't need to mount /usr/share/neutron: the directory is provided by the
openstack-neutron rpm, so we don't need to manage it ourselves. It is already
present in all neutron containers, including neutron_db_sync.
Change-Id: I6f71ce62b1c5f3de175d7a50ee7229d3047a379a
This is a mechanically generated patch to complete step 1 of moving
the zuul job settings out of project-config and into each project
repository.
Because there will be a separate patch on each branch, the branch
specifiers for branch-specific jobs have been removed.
Because this patch is generated by a script, there may be some
cosmetic changes to the layout of the YAML file(s) as the contents are
normalized.
See the python3-first goal document for details:
https://governance.openstack.org/tc/goals/stein/python3-first.html
Change-Id: Idb328be9749bb0aa1d8e8ac748fefce962829928
Story: #2002586
Task: #24341
Currently we instantiate a novaclient.client Client object without explicitly
passing any endpoint_type in kwargs. The Client object defaults to using
'publicURL': https://github.com/openstack/python-novaclient/blob/stable/queens/novaclient/client.py#L116
In some environments access to the publicURL is not desired, and it is likely the wrong default.
So this needs to be a) configurable and b) default to internalURL when nothing is specified.
We make this configurable by leveraging the os_interface key in the
placement section of nova.conf as that is what specifies the endpoint
type since ocata: https://docs.openstack.org/releasenotes/nova/ocata.html#other-notes
We also check for the existence of the [placement]/valid_interface key
and will use that instead if it is present, as it is the proper
recommended way to get this information as of queens (see
https://review.openstack.org/#/c/492247/). Since it is a list
of interfaces in order of preference, we take the first one.
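For reference, the two nova.conf keys consulted (values illustrative; note
that the option nova itself ships is spelled valid_interfaces):
[placement]
# queens+: list of interfaces in order of preference, first entry wins
valid_interfaces = internal
# ocata-era fallback consulted when the above is absent
os_interface = internal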
Tested by making sure via tcpdump that the internal_url was being hit
after restarting the nova_compute container with the patched code:
(overcloud) [stack@undercloud-0 ~]$ openstack endpoint list |grep comput
| 8ad225f34170467a84513c5b447662dc | regionOne | nova | compute | True | admin | http://172.17.1.16:8774/v2.1 |
| 9a15e824601f43629b03ec99589c3d83 | regionOne | nova | compute | True | internal | http://172.17.1.16:8774/v2.1 |
| c5b964700daf4abfac5060432debdbe3 | regionOne | nova | compute | True | public | https://10.0.0.101:13774/v2.1 |
[root@compute-0 ~]# tcpdump -i any -nn host 172.17.1.16 and port 8774
09:29:57.824687 IP 172.17.1.10.37254 > 172.17.1.16.8774: Flags [S], seq 3520534439, win 29200, options [mss 1460,sackOK,TS val 564789919 ecr 0,nop,wscale 7], length 0
09:29:57.824946 ethertype IPv4, IP 172.17.1.16.8774 > 172.17.1.10.37254: Flags [S.], seq 3844540290, ack 3520534440, win 28960, options [mss 1460,sackOK,TS val 564810385 ecr 564789919,nop,wscale 7], length 0
09:29:57.824946 IP 172.17.1.16.8774 > 172.17.1.10.37254: Flags [S.], seq 3844540290, ack 3520534440, win 28960, options [mss 1460,sackOK,TS val 564810385 ecr 564789919,nop,wscale 7], length 0
Change-Id: Ifbb40e2a2222c229fd71eca2c4c36daa448e492d
Closes-Bug: #1788584