Increase mtce host offline threshold to handle slow host shutdown

Mtce polls/queries the remote host for mtcAlive messages
over 46 x 100 ms intervals during unlock or host failed
handling. Absence of mtcAlive throughout this (~5 sec)
period indicates the node is offline.

However, in the rare case where shutdown is slow, 5 seconds
is not long enough. Cases have been seen where a 7 or 8
second wait is required to properly declare the host offline.

To avoid the rare transient 200.004 host alarm over an
unlock operation, this update increases the mtce host
offline window from approximately 5 to 10 seconds by
changing the offline threshold in the mtce configuration
file from 46 to 90.
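
As a rough check of the numbers, the unchallenged holdoff window is the audit period multiplied by the threshold. The sketch below is illustrative only: the function name `offline_window_secs` is hypothetical and not part of mtce, and the inputs are the `offline_period` and `offline_threshold` values shown in the configuration change (46 before, 90 after).

```python
def offline_window_secs(offline_period_ms: int, offline_threshold: int) -> float:
    """Return the nominal unchallenged failed-to-offline window in seconds.

    Hypothetical helper for illustration; not an mtce function.
    """
    return offline_period_ms * offline_threshold / 1000.0


old_window = offline_window_secs(100, 46)  # setting before this update
new_window = offline_window_secs(100, 90)  # setting after this update
print(f"old: {old_window:.1f} s, new: {new_window:.1f} s")
```

The nominal products are 4.6 and 9.0 seconds, which the configuration comments describe as typical 5 and 10 second holdoffs.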

Test Plan:

PASS: Verify the unchallenged failed-to-offline period is ~10 secs.
PASS: Verify the algorithm restarts if an mtcAlive is received
      at any time during the polls/queries (challenge) window.
PASS: Verify challenge handling leads to a longer but
      successful offline declaration.
PASS: Verify the above handling for both the unlock and
      spontaneous failure cases.

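The challenge behaviour exercised by the test plan can be sketched as a simple counter. This is a minimal model, assuming the threshold counts back-to-back missed mtcAlive requests as the configuration comment describes; `polls_until_offline` and its arguments are hypothetical names, and the real mtce implementation may differ.

```python
from typing import Iterable, Optional


def polls_until_offline(mtcalive_seen: Iterable[bool],
                        offline_threshold: int) -> Optional[int]:
    """Return the 1-based poll at which offline is declared, or None.

    mtcalive_seen yields one bool per audit interval: True if an mtcAlive
    message arrived in that interval, False if it was missed.
    Hypothetical model for illustration; not mtce code.
    """
    missed = 0  # back-to-back misses counted so far
    for poll, seen in enumerate(mtcalive_seen, start=1):
        if seen:
            missed = 0  # challenge: an mtcAlive restarts the count
        else:
            missed += 1
            if missed >= offline_threshold:
                return poll
    return None


# Unchallenged: 90 straight misses, so offline is declared on poll 90.
unchallenged = polls_until_offline([False] * 90, offline_threshold=90)

# Challenged: a single mtcAlive on poll 50 restarts the count, so offline
# is declared 90 polls later, on poll 140.
challenged = polls_until_offline([False] * 49 + [True] + [False] * 90,
                                 offline_threshold=90)
```

In this model a challenge delays the offline declaration but does not prevent it, which matches the "longer but successful" outcome in the test plan.
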
Closes-Bug: 2024249
Change-Id: Ice41ed611b4ba71d9cf8edbfe98da4b65dcd05cf
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
@@ -8,8 +8,8 @@ hbs_minor_threshold = 4 ; Heartbeat minor threshold count.
 ; minor notification to maintenance.
 
 offline_period = 100 ; number of msecs to wait for each offline audit
-offline_threshold = 46 ; number of back to back mtcAlive requests missed
-; 100:46 will yield a typical 5 sec holdoff from
+offline_threshold = 90 ; number of back to back mtcAlive requests missed
+; 100:90 will yield a typical 10 sec holdoff from
 ; failed to offline
 
 inventory_port = 6385 ; The Inventory Port Number