An additional info log message was added for the case where
drive-audit runs without unmounting the failed device.
Change-Id: I11abee40a712b6c6de65e63626b6f7f0a9c9f4c7
This is a follow-on from a previous commit which added recon info
for swift-drive-audit (https://review.openstack.org/#/c/122468/).
Here, the "--driveaudit" option is added to the swift-recon tool. This
feature gives the statistics for the system-wide drive errors flagged
by swift-drive-audit. An example of the output is as follows:
(verbose mode)
swift-recon --driveaudit -v
===============================================================================
--> Starting reconnaissance on 5 hosts
===============================================================================
[2015-03-11 17:13:39] Checking drive-audit errors
-> http://1.2.3.4:6000/recon/driveaudit: {'drive_audit_errors': 14}
-> http://1.2.3.5:6000/recon/driveaudit: {'drive_audit_errors': 0}
-> http://1.2.3.6:6000/recon/driveaudit: {'drive_audit_errors': 37}
-> http://1.2.3.7:6000/recon/driveaudit: {'drive_audit_errors': 101}
-> http://1.2.3.8:6000/recon/driveaudit: {'drive_audit_errors': 0}
[drive_audit_errors] low: 0, high: 101, avg: 30.4, total: 152, Failed: 0.0%, no_result: 0, reported: 5
===============================================================================
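The summary line above can be reproduced from the per-host values. The
following is only a sketch of that arithmetic; the helper name and the
dictionary layout are assumptions made for illustration and are not taken
from swift-recon itself.

```python
def summarize(errors_by_host):
    """Summarize per-host drive_audit_errors counts (hypothetical helper)."""
    values = list(errors_by_host.values())
    total = sum(values)
    return {
        "low": min(values),
        "high": max(values),
        "avg": total / len(values),
        "total": total,
        "reported": len(values),
    }


# Values copied from the example output above
sample = {
    "1.2.3.4": 14,
    "1.2.3.5": 0,
    "1.2.3.6": 37,
    "1.2.3.7": 101,
    "1.2.3.8": 0,
}
summary = summarize(sample)
```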
Change-Id: Ia16c52a9d613eeb3de1a5a428d88dd1233631912
drive-audit detects error log entries for a device and comments out
the device's line in /etc/fstab. When the error is logged several
times, the line is commented out again each time.
This patch makes drive-audit check whether the device is already
commented out, which prevents redundant commenting out.
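A minimal sketch of the idea, assuming the fstab entry is identified by its
first field. The function name and the matching rule are hypothetical and
may differ from what drive-audit actually does.

```python
def comment_out_device(fstab_lines, device):
    """Return fstab lines with the entry for `device` commented out once."""
    result = []
    for line in fstab_lines:
        stripped = line.lstrip()
        fields = stripped.split()
        # Lines that already start with '#' are left untouched, so running
        # this repeatedly never stacks up extra comment markers.
        if fields and not stripped.startswith("#") and fields[0] == device:
            line = "#" + line
        result.append(line)
    return result
```

Applying the function a second time to its own output returns the same
lines, which is the behaviour this patch is after.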
Change-Id: Ia542d35b58552dde0f324bb9c42531f98c9058fa
This patch adds console logging ability to swift-drive-audit.
There are cases where logging to the console is necessary when
drive-audit is run. The output can be consumed by monitoring tools
such as Icinga to flag errors.
DocImpact
Change-Id: Ia1e1effcbd89bd2cf6d5b8c64019f1647c736a3a
This patch adds two new features to swift-drive-audit. The first
is an option in the drive-audit.conf file that allows the operator
to prevent drives from ever being unmounted automatically,
regardless of the number of errors present. This could be of
benefit in very small systems consisting of only one or two drives
where the operator would like to manually unmount/fix the
particular drive(s) and minimise any potential downtime.
The second is another option in drive-audit.conf that allows the
operator to select a recon directory. This directory will then
have a drive.recon file which will keep an up-to-date record of
the swift drives and any errors associated with them. An example
of the output would be as follows:
{"/srv/node/disk2": "0", "/srv/node/disk3": "25", "/srv/node/disk0": "0",
"/srv/node/disk1": "0", "/srv/node/disk10": "0", "/srv/node/disk7": "0",
"/srv/node/disk4": "137", "/srv/node/disk5": "0", "/srv/node/disk8": "0",
"/srv/node/disk9": "0", "/srv/node/disk6": "0", "/srv/node/disk11": "60"}
This would allow the operator to monitor the errors on the swift
drives without having to spend time searching through logs. Also, if
this is accepted, it should be possible to add an option to
swift-recon that would keep track of this at a system level.
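As an illustration of how a monitoring script might consume such a file,
here is a short sketch. It assumes only the JSON layout shown in the example
above (mount point mapped to an error count stored as a string); the function
name is hypothetical.

```python
import json


def drives_with_errors(recon_text):
    """Return {mount_point: error_count} for drives with at least one error."""
    counts = json.loads(recon_text)
    # Counts are strings in the example output, so convert before comparing
    return {dev: int(n) for dev, n in counts.items() if int(n) > 0}


# Abbreviated copy of the example output above
example = (
    '{"/srv/node/disk2": "0", "/srv/node/disk3": "25", '
    '"/srv/node/disk4": "137", "/srv/node/disk11": "60"}'
)
flagged = drives_with_errors(example)
```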
Change-Id: Ib5dacf8622b7363e070c274c7c30c8ead448a055
Make it possible to override the default set of regexes
used to search for device block errors in the log file. Also make
the log file naming pattern configurable. Both are set in the
drive-audit.conf file.
Update the "Detecting Failed Drives" section of the admin guide as well.
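A rough sketch of the override logic, under stated assumptions: the option
prefix "regex_pattern_" and the default pattern below are illustrative
placeholders chosen for this example and should be checked against the
admin guide before use.

```python
import re

# Illustrative default only; not the pattern set shipped with drive-audit
DEFAULT_PATTERNS = [r"\berror\b.*\b(sd[a-z]{1,2}\d?)\b"]


def get_error_patterns(conf):
    """Compile user-supplied patterns if any exist, else the defaults."""
    # Assumed naming scheme: every key starting with "regex_pattern_"
    user = [v for k, v in sorted(conf.items())
            if k.startswith("regex_pattern_")]
    return [re.compile(p) for p in (user or DEFAULT_PATTERNS)]


defaults = get_error_patterns({})
custom = get_error_patterns({"regex_pattern_1": r"I/O error.*\b(dm-\d+)\b"})
```

Setting any pattern in the configuration replaces the defaults rather than
extending them, which matches the "override" wording of this change.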
Change-Id: I7bd3acffed196da3e09db4c9dcbb48a20bdd1cf0
Sometimes there is no date at the beginning of a line in kern.log.
Although it does not happen often, there should be a check ensuring
the program doesn't crash in case it happens.
Added a try-except block around the code that parses the string into a date.
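The guard can be sketched as follows. This assumes the usual syslog-style
prefix ("Mar 11 17:13:39") and that the year is supplied by the caller; the
function name is hypothetical and the real parsing code may differ.

```python
from datetime import datetime


def parse_log_date(line, year):
    """Return the timestamp of a kern.log line, or None if it has no date."""
    parts = line.split()
    # Prepend the year because the log prefix itself does not carry one
    candidate = " ".join([str(year)] + parts[:3])
    try:
        return datetime.strptime(candidate, "%Y %b %d %H:%M:%S")
    except ValueError:
        # Line does not start with a date; skip it instead of crashing
        return None
```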
Change-Id: I44a101266582eea2199189a006afa1037a9bd4ea
Fixes: bug #1152658
This change supports kern.log rotation in order to avoid losing
significant information.
Year-change handling is added because kern.log does not record
the year.
A backwards function is also added, which allows reading logs
from back to front and speeds up execution, along with a unit
test for it.
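Reading from the back lets the audit stop as soon as it reaches entries
older than the window of interest. The generator below is one possible
sketch of such a reader, not the code from this change; it works on a file
opened in binary mode and, as a simplification, skips empty lines.

```python
import io
import os


def backward(f, blocksize=4096):
    """Yield the non-empty lines of binary file `f` from last to first."""
    f.seek(0, os.SEEK_END)
    pos = f.tell()
    tail = b""
    while pos > 0:
        read = min(blocksize, pos)
        pos -= read
        f.seek(pos)
        block = f.read(read) + tail
        lines = block.split(b"\n")
        # The first piece may be a partial line; carry it into the next block
        tail = lines.pop(0)
        for line in reversed(lines):
            if line:
                yield line
    if tail:
        yield tail


# Small block size to exercise the carry-over between blocks
sample = io.BytesIO(b"first\nsecond\nthird\n")
reversed_lines = list(backward(sample, blocksize=4))
```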
Fixes Bug 1080682
Change-Id: I93436c405aff5625396514000cab774b66022dd0
This required a bunch of whitespace-poking of the scripts in bin, but
that's all. Now every file in swift/ and bin/ is pep8-1.3.3-compliant,
so hopefully we can be done with this pep8 stuff for a good long time.
Change-Id: I44fdb41d219c57400a4c396ab7eb0ffa9dcd8db8