d37490b814
This update adds common plugin support for alarm state auditing.

The audit is able to detect and correct the following alarm state
errors:

   Error Case                 Correction Action
   -----------------------    -----------------
 - stale alarm              ; delete alarm
 - missing alarm            ; assert alarm
 - alarm severity mismatch  ; refresh alarm

The common audit is enabled for the fm_notifier plugin, which supports
alarm management for the following resources:

 - CPU with alarm id 100.101
 - Memory with alarm id 100.103
 - Filesystem with alarm id 100.104

Other plugins may use this common audit in the future, but only the
above resources have the audit enabled for them by this update.

Test Plan:

PASS: Verify stale alarm detection/correction handling
PASS: Verify missing alarm detection/correction handling
PASS: Verify alarm severity mismatch detection/correction handling
PASS: Verify host only audits its own specified alarms
PASS: Verify success path of monitoring a single alarm and a mix of
      base and instance alarms of varying severity while such alarm
      conditions come and go
PASS: Verify alarm audit of mix of base and instance alarms over a
      collectd process restart
PASS: Verify audit handling of alarm that migrates from major to
      critical to major to clear
PASS: Verify audit handling of transitions between alarm and no alarm
      conditions
PASS: Verify soak of random cpu, memory and filesystem overage alarm
      assertions and clears that also involve manual alarm deletions,
      assertions and severity changes that exercise new audit features

Regression:

PASS: Verify alarm and audit handling over Swact with mounted
      filesystem that has active alarm
PASS: Verify collectd logs following a system install and while alarms
      are managed during above soak
PASS: Verify behavior while FM is killed or stopped/started
PASS: Verify Standard system install with Sanity and Regression
PASS: Verify AIO DX/DC systems install with Sanity and Regression

Closes-Bug: 1925210
Change-Id: I1cafd17ad07ec769240de92ae4e67cb1357f0992
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
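
Illustrative note: the three error cases and their correction actions
described in this message can be sketched as a reconciliation between
the alarms the plugin expects and the alarms actually raised. This is
a minimal sketch under stated assumptions, not the code in this
change: the function name audit_alarms, the (alarm id, entity) keys,
the entity strings and the action tuples are all hypothetical, and no
real FM API calls are made.

```python
# Hypothetical sketch of the audit's decision logic; not the plugin's code.
# Alarm ids come from the commit message; everything else is assumed.
AUDITED_ALARM_IDS = {"100.101", "100.103", "100.104"}


def audit_alarms(expected, raised, audited_ids=AUDITED_ALARM_IDS):
    """Return the correction actions needed to reconcile alarm state.

    expected: dict of (alarm_id, entity) -> severity the plugin believes
              should currently be asserted.
    raised:   dict of (alarm_id, entity) -> severity actually raised.
    Alarms whose id is not in audited_ids are ignored.
    """
    actions = []

    # Stale alarm: raised but no longer expected -> delete alarm.
    for key in sorted(raised):
        if key[0] in audited_ids and key not in expected:
            actions.append(("delete", key, None))

    for key in sorted(expected):
        if key[0] not in audited_ids:
            continue
        severity = expected[key]
        if key not in raised:
            # Missing alarm: expected but not raised -> assert alarm.
            actions.append(("assert", key, severity))
        elif raised[key] != severity:
            # Severity mismatch -> refresh alarm at the expected severity.
            actions.append(("refresh", key, severity))

    return actions


if __name__ == "__main__":
    expected = {
        ("100.101", "host=controller-0"): "major",
        ("100.104", "host=controller-0.filesystem=/var"): "critical",
    }
    raised = {
        ("100.101", "host=controller-0"): "critical",
        ("100.103", "host=controller-0"): "minor",
    }
    for action in audit_alarms(expected, raised):
        print(action)
```

In this sketch the stale check runs first so that deletions are
reported before assertions and refreshes; that ordering is a choice
made for readability, not something stated in this change.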