e3705e6046
The k8s-pod-recovery service failed to restore the SRIOV device plugin, which pods using SRIOV interfaces need in order to create the resource. Those pods must carry the label 'restart-on-reboot=true' to be restarted during boot. The failure was observed during an upgrade and, although rare, it left the operator to restart the pods manually afterwards.

This change adds a wait for the pod to stabilize (it is considered stable once its state stops changing) and, if the pod is still in failure, makes 2 attempts to restore the plugin. Logging was added to better record the pod state in case of an error.

Test Plan:
[PASS] execute 7 upgrades in an AIO-SX lab

Closes-Bug: 1999074
Signed-off-by: Andre Fernando Zanella Kantek <AndreFernandoZanella.Kantek@windriver.com>
Change-Id: I838c35d3e0a3557c71344945a8e00f22ccb50eb4

| File |
|---|
| k8s-pod-recovery |
| k8s-pod-recovery.service |
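
The wait-then-retry behavior described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual k8s-pod-recovery code: the function names, the `kube-system` namespace, the `app=sriovdp` label selector, the "Running" success criterion, and the polling parameters (`POLL_INTERVAL`, `STABLE_CHECKS`, `MAX_POLLS`) are all assumptions made for the example.

```bash
#!/bin/bash
# Hypothetical sketch; every name and timing here is an assumption.

# Assumed probe: print the phase of the SRIOV device plugin pod.
# Namespace and label selector are illustrative guesses.
get_plugin_state() {
    kubectl get pods -n kube-system -l app=sriovdp \
        --no-headers -o custom-columns=STATUS:.status.phase 2>/dev/null
}

# Assumed restore action: delete the plugin pod so it gets recreated.
restore_plugin() {
    kubectl delete pods -n kube-system -l app=sriovdp --wait=false
}

# Poll until the state is unchanged for STABLE_CHECKS consecutive polls,
# or until MAX_POLLS polls have been made. Prints the last observed state.
wait_for_stable_state() {
    local prev="" cur="" same=0 polls=0
    while [ "$polls" -lt "${MAX_POLLS:-30}" ]; do
        cur=$(get_plugin_state)
        if [ "$cur" = "$prev" ]; then
            same=$((same + 1))
        else
            same=0
            # Log each transition so the pod state is on record.
            echo "plugin pod state changed: '${prev}' -> '${cur}'" >&2
        fi
        prev=$cur
        polls=$((polls + 1))
        if [ "$same" -ge "${STABLE_CHECKS:-3}" ]; then
            break
        fi
        sleep "${POLL_INTERVAL:-5}"
    done
    echo "$cur"
}

# Wait for stabilization, then make at most 2 restore attempts.
# Returns 0 if the plugin pod ends up Running, non-zero otherwise.
recover_plugin() {
    local state attempt
    state=$(wait_for_stable_state)
    for attempt in 1 2; do
        [ "$state" = "Running" ] && return 0
        echo "plugin pod state '${state}', restore attempt ${attempt}" >&2
        restore_plugin
        state=$(wait_for_stable_state)
    done
    [ "$state" = "Running" ]
}
```

A caller would invoke `recover_plugin` once during boot and treat a non-zero return as a failure to log. Waiting for the state to settle before acting avoids restarting a pod that is still in the middle of coming up on its own.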