Linux - Enterprise: This forum is for all items relating to using Linux in the Enterprise.
Does anyone know where I can find a list of most or all possible error codes that are displayed in dmesg?
I am working on a server verification script to run after we apply patches to systems, and I want to be able to automatically detect and log all errors in dmesg without having to actually read it.
Hmm, not sure if such a thing exists. I think each module or component writes its own messages in its own formats. Most of them don't seem to have the kind of error codes that could be looked up in a list or used to drive automation.
What are you trying to do in the verification scripts? dmesg pretty much is "automatically detect and log all errors". I assume you want to get alerts to another system for error conditions?
Well, after we apply patches/updates we always scan dmesg to see if there are bad config options, failed hardware messages, segfaults, etc. So I am just trying to cut out the time it takes to read all of dmesg manually and instead write anything suspicious to a log I have. Eventually I will do the same thing for other system logs such as /var/log/messages. But as of now dmesg is the hardest one to find issues in, so I want to get it knocked out first.
Essentially I have a script now that remotely checks a node to gather information about the health of the machine and writes it to a local log. Then, after it gets that information, it displays dmesg and /var/log/messages for manual review. I want to be able to completely automate it so I can execute the script and just read the local log only.
time.c: can't update CMOS clock from 9 to 55
nfs: server leviticus02 not responding, still trying
nfs: server leviticus02 OK
ipv6: Unknown parameter `off'
set_rtc_mmss: can't update from 5 to 59
VMCIUtil: Updating context id from 0xffffffff to 0x3c7e92e8 on event 0.
portmap: server localhost not responding, timed out
RPC: failed to contact portmap (errno -5).
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-3, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 1048568k swap on /dev/VG00/LV01. Priority:-1 extents:1
Adding 1048568k swap on /dev/sde. Priority:-2 extents:1
VMware memory control driver initialized
Clock: inserting leap second 23:59:60 UTC
Then I have a script like this
Code:
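#!/bin/bash
# Usage: ./parse_dmesg.sh HOST  (reads a saved dmesg dump from /tmp/dmesg.HOST)
HOST="$1"
DMESGFILE="/tmp/dmesg.$HOST"
LOG="dmesg.log"   # not used yet; matches are only printed to stdout
ERRORS="'Unknown parameter' 'can\'t update CMOS clock' 'not responding' '0x' 'inserting leap second'"
# Caveat: $ERRORS is expanded unquoted below, so the shell splits it on
# whitespace. awk runs once per word (update, CMOS, leap, ...), not once per
# quoted phrase, and the single quotes stay in the pattern as literal characters.
while read -r line; do
    for i in $ERRORS; do
        echo "$line" | awk "/$i/"
    done
done < "$DMESGFILE"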
Then I can run
Code:
$ ./parse_dmesg.sh fake
And it will return this
Code:
time.c: can't update CMOS clock from 9 to 55
time.c: can't update CMOS clock from 9 to 55
set_rtc_mmss: can't update from 5 to 59
Clock: inserting leap second 23:59:60 UTC
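The output above repeats the CMOS clock line and misses the "not responding" lines because the unquoted $ERRORS is split on whitespace, so each word (update, CMOS, leap) becomes its own awk pattern. Below is a minimal sketch of one way around that, keeping the same phrase list but putting one phrase per line and handing the whole list to grep -F. The function name scan_dmesg is made up for illustration, and the sketch assumes the dmesg dump has already been saved to a file.

```shell
#!/bin/bash
# Sketch only: scan_dmesg is a hypothetical helper, not part of the original script.

# One phrase per line; grep treats a newline-separated list as alternatives.
PATTERNS="Unknown parameter
can't update CMOS clock
not responding
0x
inserting leap second"

# scan_dmesg FILE: print each line of FILE that contains any phrase, once.
scan_dmesg() {
    # -F matches the phrases as fixed strings rather than regular expressions.
    grep -F -e "$PATTERNS" -- "$1"
}

# Example call, appending matches to the local log:
#   scan_dmesg "/tmp/dmesg.$HOST" >> dmesg.log
```

Each matching line is printed once, however many phrases it contains. Note that the set_rtc_mmss line is no longer reported, because it does not contain the full phrase "can't update CMOS clock", and that a bare "0x" also flags the informational VMCIUtil line, so that pattern is probably too broad.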
Well I would do it with regular expressions (in Perl or Python) not plain string matches ... but still I think you are going to have to cook up your own list of warning matches. I'm not aware of any general standards for errors in dmesg.
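The regular-expression idea can also be tried without leaving the shell, using grep -E in place of Perl or Python. This is only a rough sketch: the function name scan_dmesg_regex is hypothetical, and the patterns are guesses based on the sample dmesg lines earlier in the thread, not any standard list.

```shell
#!/bin/bash
# Sketch only: scan_dmesg_regex and these patterns are illustrative assumptions.

# One extended regular expression per line; each can cover several variants.
REGEXES="can't update( CMOS clock)? from [0-9]+ to [0-9]+
server [^ ]+ not responding
Unknown parameter
failed to contact [a-z]+ \(errno -[0-9]+\)"

# scan_dmesg_regex FILE: print each line of FILE matching any expression.
scan_dmesg_regex() {
    # -E enables extended syntax such as ( )?, + and escaped literal parentheses.
    grep -E -e "$REGEXES" -- "$1"
}

# Example call:
#   scan_dmesg_regex "/tmp/dmesg.$HOST"
```

The first expression catches both the time.c and the set_rtc_mmss "can't update" messages with a single entry, which is the main gain over plain string matches.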
I'm pretty sure there's no standard; get yourself a good thesaurus, e.g. http://thesaurus.com/, and look up words like "don't", "can't", "fail", "bad" & friends ...
Good luck, you're going to need it ...
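For what it's worth, a keyword sweep along those lines might look like the sketch below. The first four keywords are the ones suggested above; the rest are assumed "friends" added for illustration, and log_suspicious is a made-up helper name. It assumes the dmesg output has been saved to a file and appends any hits to a local log, tagged with the host name.

```shell
#!/bin/bash
# Sketch only: log_suspicious is a hypothetical helper.
# "error", "timed out" and "not responding" are assumed additions to the list.
KEYWORDS="don't|can't|fail|bad|error|timed out|not responding"

# log_suspicious HOST FILE LOG: append lines of FILE that contain any keyword
# to LOG, each prefixed with HOST.
log_suspicious() {
    host="$1"; file="$2"; log="$3"
    # -i ignores case; -E enables the a|b alternation syntax.
    grep -iE -e "$KEYWORDS" -- "$file" |
        while IFS= read -r line; do
            printf '%s: %s\n' "$host" "$line"
        done >> "$log"
}

# Example call:
#   log_suspicious "$HOST" "/tmp/dmesg.$HOST" dmesg.log
```

On the sample dmesg lines earlier in the thread this would catch the "can't update", "not responding" and "failed to contact" messages but not "Unknown parameter", so a generic keyword list still needs site-specific additions.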
You could try homing in on words related to whatever you just changed ...