Wait for rook-ceph to be applied in factory backup stage

If rook-ceph is configured in a standalone system, during a factory
install the backup script fails when it tries to create the ceph
backup, since the rook-ceph application is not yet applied.

This commit updates the 10-system-backup script to wait for the
rook-ceph application to reach the "applied" state before running the
backup.

When rook-ceph is configured, this check adds between 7 to 11 minutes
to the factory-install process, depending on how long rook-ceph takes
to reach the applied state.

Test Plan:
PASS: Run a factory-install in a standalone host with rook-ceph
configured. Verify that the setup backup waits until the app is applied
and the factory-install completes successfully.
PASS: Run a factory-install in a standalone host without rook-ceph
configured. Verify that the backup stage does not wait for the app.

Closes-bug: 2125236

Change-Id: I67b18426b1acc3f3389f2aae71a93b7bb0a66175
Signed-off-by: Enzo Candotti <Enzo.Candotti@windriver.com>
This commit is contained in:
Enzo Candotti
2025-09-19 18:55:17 -03:00
parent 2c03790059
commit f13b13f2d4
2 changed files with 31 additions and 1 deletions

View File

@@ -81,6 +81,36 @@ if systemctl is-failed --quiet fm-api.service; then
systemctl restart fm-api.service
fi
source /etc/platform/openrc >/dev/null 2>&1 || true
# Check if rook-ceph is configured in the node via ceph-rook-store backend
if system storage-backend-list 2>/dev/null | grep -q ceph-rook-store; then
log_info "Waiting for rook-ceph application to reach applied state..."
start_time=$(date +%s)
timeout=$((15 * 60)) # 15 minutes
while true; do
# Prevent non-zero rc from triggering the ERR trap: swallow the error with || true
rook_status=$(system application-show rook-ceph --column status --format value 2>/dev/null || true)
if [ "$rook_status" = "applied" ]; then
log_info "Ready - rook-ceph application applied"
break
elif [ "$rook_status" = "apply-failed" ]; then
log_fatal "rook-ceph application failed to apply"
fi
now=$(date +%s)
elapsed=$((now - start_time))
if [ $elapsed -ge $timeout ]; then
log_warn "Timeout after $((elapsed/60)) minutes. Current rook-ceph status: $rook_status."
exit 0
fi
sleep 15
done
fi
log_info "Creating backup directory: $BACKUP_DIR"
mkdir -p "$BACKUP_DIR"
check_rc_die $? "Failed to create backup directory $BACKUP_DIR"

View File

@@ -4,7 +4,7 @@
#
# SPDX-License-Identifier: Apache-2.0
#
# Factory install system setup triggered during the config stage
# Factory install system setup triggered during the setup stage
#
echo "System Setup - Start"