In the same spirit as change I1f07272499b419079466cf9f395fb04a082099bd
we want to rerun all pacemaker _init_bundles on every deployment, for a
few main reasons:
1) We will eventually support scaling-up roles that contain
pacemaker-managed services and we need to rerun _init_bundles so that
pacemaker properties are created for the newly added nodes.
2) When you replace a controller the pacemaker properties will be
recreated for the newly added node.
3) We need to create appropriate iptables rules whenever we add a
service to an existing deployment.
We do this by adding the DeployIdentifier to the environment so that
paunch will retrigger a run at every redeploy.
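As a rough sketch (the container name and exact template wording below
are illustrative, not the literal patch), each *_init_bundle container
definition gains an environment entry derived from DeployIdentifier, so
its config hash changes on every stack update and paunch re-runs it:
  docker_config:
    step_2:
      some_service_init_bundle:
        start_order: 0
        detach: false
        net: host
        environment:
          # changes on every deploy, forcing paunch to re-run the container
          - list_join:
              - ''
              - - 'TRIPLEO_DEPLOY_IDENTIFIER='
                - {get_param: DeployIdentifier}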
Partial-Bug: #1775196
Change-Id: Ifd48d74507609fc7f4abc269b61b2868bfbc9272
OpenDaylight creates multiple files the first time it boots, which we do
not mount to the host. After the first boot, it creates a cache which we
do mount to the host. This means that on a config change or
update/upgrade of ODL the cache will not be removed, but the files will
be. This causes ODL to fail to start.
The solution is to stop the container during update/upgrade and remove
the cache before the update happens. This triggers the new ODL version
to rebuild the cache. For config changes, we also need to remove the
cache in the host_prep_tasks so that we do not end up in a similar
state.
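A minimal sketch of the host_prep_tasks side of this, assuming an
illustrative cache path (the actual template defines the exact directory
that is bind mounted from the host):
  host_prep_tasks:
    - name: Remove the OpenDaylight cache so it is rebuilt on next start
      file:
        path: /var/lib/opendaylight/data/cache   # illustrative path
        state: absent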
Closes-Bug: 1775919
Change-Id: Ia457b90b765617822e9adbf07485c9ea1fe179e5
Signed-off-by: Tim Rozet <trozet@redhat.com>
During the containerization work we regressed on the restart of
pacemaker resources when a config change for the service was detected.
In baremetal we used to do the following:
1) If a puppet config change was detected we'd touch a file with the
service name under /var/lib/tripleo/pacemaker-restarts/<service>
2) A post deployment bash script (extraconfig/tasks/pacemaker_resource_restart.sh)
would test for the service file's existence and restart the pcs service via
'pcs resource restart --wait=600 service' on the bootstrap node.
With this patchset we make use of paunch's ability to detect whether a
config hash change happened in order to respawn a temporary container
(called <service>_restart_bundle) which simply restarts the pacemaker
service from the bootstrap node whenever invoked, but only if the pcmk
resource already exists. For this reason we add config_volume and bind
mount it inside the container, so that the TRIPLEO_CONFIG_HASH env
variable gets generated for these *_restart_bundle containers.
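Schematically, a restart bundle looks like the sketch below, using
haproxy as an example; the container name, volume path and exact command
are illustrative rather than the literal template content:
  docker_config:
    step_2:
      haproxy_restart_bundle:
        start_order: 2
        detach: false
        net: host
        # bind mounting the config_volume is what makes paunch generate
        # TRIPLEO_CONFIG_HASH for this container, so it is respawned (and
        # the restart runs) only when the generated config has changed
        volumes:
          - /var/lib/config-data/puppet-generated/haproxy:/var/lib/kolla/config_files/src:ro
        # the real command also checks that the pcmk resource exists and
        # runs only on the bootstrap node
        command: pcs resource restart --wait=600 haproxy-bundle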
We tested this change as follows:
A) Deployed an HA overcloud with this change and observed that pcmk resources
were not restarted needlessly during initial deploy
B) Rerun the exact same overcloud deploy with no changes, observed that
no spurious restarts would take place
C) Added an env file to trigger a config change for haproxy [1], redeployed and observed that it restarted
haproxy only:
Jun 06 16:22:37 overcloud-controller-0 dockerd-current[15272]: haproxy-bundle restart invoked
D) Added a trigger [2] for mysql config change, redeployed and observed restart:
Jun 06 16:40:52 overcloud-controller-0 dockerd-current[15272]: galera-bundle restart invoked
E) Added a trigger [3] for a rabbitmq config change, redeployed and observed restart:
Jun 06 17:03:41 overcloud-controller-0 dockerd-current[15272]: rabbitmq-bundle restart invoked
F) Added a trigger [4] for a redis config change, redeployed and observed restart:
Jun 07 08:42:54 overcloud-controller-0 dockerd-current[15272]: redis-bundle restart invoked
G) Rerun a deploy with no changes and observed that no spurious restarts
were triggered
[1] haproxy config change trigger:
    parameter_defaults:
      ExtraConfig:
        tripleo::haproxy::haproxy_globals_override:
          'maxconn': 1111
[2] mysql config change trigger:
    parameter_defaults:
      ExtraConfig:
        mysql_max_connections: 1111
[3] rabbitmq config change trigger (default partition handling is 'ignore'):
    parameter_defaults:
      ExtraConfig:
        rabbitmq_config_variables:
          cluster_partition_handling: 'pause_minority'
          queue_master_locator: '<<"min-masters">>'
          loopback_users: '[]'
[4] redis config change trigger:
    parameter_defaults:
      ExtraConfig:
        redis::tcp_backlog: 666
        redis::params::tcp_backlog: 666
Change-Id: I62870c055097569ceab2ff67cf0fe63122277c5b
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Closes-Bug: #1775196
There was a typo in the update_tasks for Manila which was causing
updates and upgrades to fail. This patch fixes the typo.
Closes-Bug: 1775667
Change-Id: I88dd16fa94111a4eb56aeaa32b560cf7d12b9f82
This ensures the docker service on the OpenShift nodes is able to pull
from a local registry if configured that way.
Change-Id: Ifd48b2e6500b10d108985a4a9f1d73493d404134
Depends-On: I31494ff8524b90343e6e8c67bd08a354837ecc45
* Since the $@ parameter may contain a pipe '|', it needs to be processed
correctly. Currently only the part before the pipe is assigned to $@, so
bash runs the pipe and the commands after it once the function has
finished. We want the whole command, pipe included, assigned to $@, so
quotes ("") around the piped command are required.
* Replace $() with eval, because $() doesn't handle the pipe correctly:
it tries to escape it, so the output variable contains wrong data.
* This patch also adds tonumber to the first invocation.
Change-Id: I958e14c0a4ea4b5782d2c74dc895471b0f70b875
xinetd.service is not installed on pre-provisioned nodes,
so we add an extra check before restarting it.
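The guard is along these lines (an illustrative ansible-style sketch,
not the exact task from the patch):
  - name: Check whether the xinetd unit file exists
    stat:
      path: /usr/lib/systemd/system/xinetd.service
    register: xinetd_unit
  - name: Restart xinetd only when it is installed
    service:
      name: xinetd
      state: restarted
    when: xinetd_unit.stat.exists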
Change-Id: I4fbac81ceb4aba534395cf8c0a842fb732559234
Closes-Bug: 1775154
In the event you have different disks in the nodes assigned to each role,
you may need to pass role specific parameters, e.g.:
parameter_defaults:
  OpenShiftMasterParameters:
    OpenShiftGlusterDisks:
      - /dev/vdc
  OpenShiftWorkerParameters:
    OpenShiftGlusterDisks:
      - /dev/vdd
To enable that we create an inventory file per role, and pass the directory
of files to ansible.
Change-Id: I8b4d8698405ffb004b081e1f097f300216edfa77
The goal is to be able to point the Gnocchi file driver directory
to an NFS share.
A new parameter GnocchiFileBasePath allows customizing
the bind mount to /var/lib/gnocchi.
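For example, an operator could point it at an NFS-backed mount
(the path below is illustrative):
  parameter_defaults:
    GnocchiFileBasePath: /srv/nfs/gnocchi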
Change-Id: I868a368161f4a529e5e7dc3593dc6862e3196247
We need KernelIpNonLocalBind on the undercloud to be able to bind
non-local IPs, among other IP forwarding options. This sysctl parameter
was managed by instack-undercloud but was never ported to the
containerized undercloud. We need the same sysctl parameters for parity
with the non-containerized undercloud.
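The intent boils down to a setting like the following (a minimal
example; the underlying sysctl is net.ipv4.ip_nonlocal_bind):
  parameter_defaults:
    KernelIpNonLocalBind: 1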
Change-Id: Idd3d432b8f7eb573d94cd56be8e05614510ebddf
Related-Bug: #1774898
Add an OPNFV scenario environment that uses ODL for overcloud
networking and OVS for virthost networking.
Depends-On: I33602ac5521c4f059c1a0d08e3e828fb64d3c817
Depends-On: Ib7968c46a59f266c20628c36178d2235ad833915
Depends-On: I37405e41ec0f85249cef87c09c966cbe0f9baddf
Change-Id: If1f476bb933106456df3568978b4555dde190621
Modify both the inspector and dnsmasq containers so that the inspector
is able to modify the dnsmasq configuration on the fly to filter DHCP
traffic.
The upgrade_tasks moved to the puppet service in order to be shared
between the containerised and regular deployments, and were amended with
steps to clean up the inspector iptables chain and rules.
With the inspector no longer managing iptables rules, create new rules
to allow DHCP traffic on IronicInspectorInterface.
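Schematically, the new rule has this shape (the rule name/number and
keys are illustrative; the actual change also scopes the rule to
IronicInspectorInterface):
  tripleo.ironic_inspector.firewall_rules:
    '137 ironic-inspector dhcp':
      dport: 67
      proto: udp
      action: accept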
Co-Authored-By: Harald Jensås <hjensas@redhat.com>
Change-Id: Ic7e32acb8559a7a12cd8767dc68c343872a6a4e3
Depends-On: I056cdadc025f35d8b6fd22f510a7c0a8e259a1f0
We don't expect our operators to have SSH keys set up on the undercloud
node, so we don't want to disable PasswordAuthentication in
sshd_config.
Depends-On: I88b24c82fb3cf2309f45d5d447a9b0c403da7fc9
Change-Id: I10b112e8bffff30879606ddd970dfd3ec67fd9c7
Closes-Bug: #1772519
This patch adds the required parameters to the Compute role so the
agents are configured properly on upgrade.
Related-Bug: #1774199
Change-Id: Iab42ae0fb13e8e92cc9903432a95e04a94a5913c
To trigger ceph-ansible we need to make sure the WorkflowSteps
resource is enabled in the ceph-upgrade-prepare env file.
Change-Id: Id760305971a68c397f9334265dd023b1e1884295
Closes-Bug: 1774647
We currently create /var/lib/docker-puppet/docker-puppet.sh
inside the mp_puppet_config() function which then gets
invoked in parallel via the following:
p = multiprocessing.Pool(process_count)
returncodes = list(p.map(mp_puppet_config, process_map))
This is problematic because we have the following potential race:
1) Process A opens /var/lib/docker-puppet/docker-puppet.sh for writing
2) Process B runs docker run and has the following bind mount:
/var/lib/docker-puppet/docker-puppet.sh:/var/lib/docker-puppet/docker-puppet.sh:z
3) Process B will fail because an exec of a file being written to
will return ETXTBSY
The deployment can fail due to the above with the following error:
[root@overcloud-controller-2 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a401108cd827 192.168.24.1:8787/tripleoqueens/centos-binary-glance-api:current-tripleo-rdo "/var/lib/docker-p..." 19 minutes ago Exited (1) 19 minutes ago docker-puppet-glance_api
[root@overcloud-controller-2 ~]# docker logs docker-puppet-glance_api
standard_init_linux.go:178: exec user process caused "text file busy"
Since /var/lib/docker-puppet/docker-puppet.sh never changes
there is really no need to create it multiple times. Let's just
create it once before spawning the multiple docker run commands
so we avoid any ETXTBSY errors.
Ran 20 successful deployments in sequence with this change applied.
Change-Id: I16b19488ce9f1411273459576db76d16b318dacb
Closes-Bug: #1760787