LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Old 11-09-2012, 06:10 AM   #1
Iyyappan
Member
 
Registered: Dec 2008
Location: Chennai, India
Distribution: CentOS 5, SLES 11
Posts: 229

Rep: Reputation: 4
MCE Log


Hi,
I have installed mcelog on RHEL 6.1 (x86_64) and have a question about it. Nothing is being written to /var/log/mcelog. Is there a way to test whether mcelog is working properly? Does anything need to be changed in the /etc/mcelog/mcelog.conf file?

The mcelog version is mcelog-1.0pre3_20101112-0.6.el6.x86_64.

Running mcelog --client does not produce any output.

So I do not know whether mcelog is working properly or not.

Last edited by Iyyappan; 11-09-2012 at 07:18 AM.
 
Old 11-10-2012, 12:55 PM   #2
btmiller
Senior Member
 
Registered: May 2004
Location: In the DC 'burbs
Distribution: Arch, Scientific Linux, Debian, Ubuntu
Posts: 4,119

Rep: Reputation: 315
Normally, you wouldn't expect any entries in the mcelog. It reports machine check exceptions, which are serious hardware errors, so if you don't see any errors logged, that's a good sign! Of course, you may still have flaky hardware, but if it's not being stressed the problems might not show up. Are you running CPU- or memory-intensive programs on the machine? You could always run hardware diagnostic or stress-test tools (e.g. memtest86+ for memory) on the hardware to see if they report errors.
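
Since an empty log can mean either healthy hardware or an mcelog that is not actually running, it may help to rule out the second case first. The following is only a rough sketch: it assumes the kernel exposes machine checks through the /dev/mcelog character device and that the daemon uses the client socket /var/run/mcelog-client (the socket-path value from the configuration posted in this thread). The function name mcelog_sanity is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether the pieces mcelog depends on exist.
# Both paths are assumptions and can be overridden via the arguments.
mcelog_sanity() {
    dev="${1:-/dev/mcelog}"
    sock="${2:-/var/run/mcelog-client}"

    # The kernel interface that mcelog reads from is a character device.
    if [ -c "$dev" ]; then
        echo "device: present"
    else
        echo "device: missing"
    fi

    # In daemon mode mcelog creates a unix socket for "mcelog --client".
    if [ -S "$sock" ]; then
        echo "socket: present"
    else
        echo "socket: missing"
    fi
}
```

Calling mcelog_sanity as root with no arguments checks the default paths. If both are present and the log stays empty, that is most likely just the absence of hardware errors. For an end-to-end test, the separate mce-inject tool can inject a synthetic machine check on kernels built with injection support, but that is not shown here.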
 
Old 11-15-2012, 09:00 PM   #3
Iyyappan
Member
 
Registered: Dec 2008
Location: Chennai, India
Distribution: CentOS 5, SLES 11
Posts: 229

Original Poster
Rep: Reputation: 4
My mcelog configuration is shown below. Syslog is not used on my machine. Is this configuration fine, apart from /tmp/logfile? No logs are being written, so I doubt whether mcelog is working properly. Can anyone send a link that explains the options in mcelog.conf? I went through the man page and mcelog --help, but I still need information on options like "optionname, filter,

[root@server mcelog]# cat /etc/mcelog/mcelog.conf
#
# Example config file for mcelog
# mcelog is the user space backend that decodes and process machine check events
# (cpu hardware errors) reported by the CPU to the kernel
#

# general format
#optionname = value
# white space is not allowed in value currently, except at the end where it is dropped
#

# in general all command line options that are not commands work here
# see man mcelog or mcelog --help for a list
# e.g. to enable the --no-syslog option use
#no-syslog = yes (or no to disable)
# when the option has a argument
logfile = /tmp/logfile
# below are the options which are not command line options

# Set CPU type for which mcelog decodes events:
cpu = intel
# for valid values for type please see mcelog --help
# If this value is set incorrectly the decoded output will be likely incorrect.
# by default when this parameter is not set mcelog uses the CPU it is running on
# on very new kernels the mcelog events reported by the kernel also carry
# the CPU type which is used too when available and not overriden.

# Enable daemon mode:
daemon = yes
# By default mcelog just processes the currently pending events and exits.
# in daemon mode it will keep running as a daemon in the background and poll
# the kernel for events and then decode them.

# Filter out known broken events by default
filter = yes
# don't log memory errors individually
# they still get accounted if that is enabled
filter-memory-errors = no

# output in undecoded raw format to be easier machine readable
# (default is decoded)
raw = yes

# Set CPU Mhz to decode uptime from time stamp counter (output
# unreliable, not needed on new kernels which report the event time
# directly. A lot of systems don't have a linear time stamp clock
# and the output is wrong then.
# Normally mcelog tries to figure out if it the TSC is reliable
# and only uses the current frequency then.
# Setting a frequency forces timestamp decoding.
# This setting is obsolete with modern kernels which report the time
# directly.
cpumhz = 2992.697

# log output options
# Log decoded machine checks in syslog (default stdout or syslog for daemon)
#syslog = yes
# Log decoded machine checks in syslog with error level
#syslog-error = yes
# Never log anything to syslog
no-syslog = yes
# Append log output to logfile instead of stdout. Only when no syslog logging is active
#logfile = filename

# Use SMBIOS information to decode DIMMs (needs root)
# This function is not recommended to use right now and generally not needed
# The exception is memdb prepopulation, which is configured separately below.
#dmi = no

# when in daemon mode run as this user after set up
# note that the triggers will run as this user too
# setting this to non root will mean that triggers cannot take some corrective
# action, like offlining objects
#run-credentials-user = root
# group to run as daemon with
# default to the group of the run-credentials-user
#run-credentials-group = nobody

[server]
# user allowed to access client socket.
# when set to * match any
# root is always allowed to access
# default: root only
client-user = root
# group allowed to access mcelog
# when no group is configured any group matches (but still user checking)
# when set to * match any
client-group = root
# path to the unix socket for client<->server communication
# when no socket-path is configured the server will not start
socket-path = /var/run/mcelog-client
# when mcelog starts it checks if a server is already running. timeout
# for this check.
initial-ping-timeout = 2
#
[dimm]
# Is the in memory DIMM error tracking enabled?
# Only works on systems with integrated memory controller and
# which are supported
# Only takes effect in daemon mode
dimm-tracking-enabled = yes
# Use DMI information from the BIOS to prepopulate DIMM database
# Note this might not work with all BIOS and requires mcelog to run as root.
# Alternative is to let mcelog create DIMM objects on demand.
dmi-prepopulate = yes
#
# execute these triggers when the rate of corrected or uncorrected
# errors per DIMM exceeds the threshold
# Note when the hardware does not report DIMMs this might also
# be per channel
# The default of 10/24h is reasonable for server quality
# DDR3 DIMMs as of 2009/10
#uc-error-trigger = dimm-error-trigger
uc-error-threshold = 1 / 24h
#ce-error-trigger = dimm-error-trigger
ce-error-threshold = 10 / 24h

[socket]
# Memory error accounting per socket
socket-tracing-enabled = yes
# Threshold and trigger for uncorrected memory errors on a socket
# mem-uc-error-trigger = socket-memory-error-trigger
mem-uc-error-threshold = 100 / 24h
# Threshold and trigger for corrected memory errors on a socket
#mem-ce-error-trigger = socket-memory-error-trigger
#mem-ce-error-threshold = 100 / 24h
# Log socket error threshold explicitely?
#mem-ce-error-log = yes


[cache]
# Processing of cache error thresholds reported by Intel CPUs
#cache-threshold-trigger = cache-error-trigger
# Should cache threshold events be logged explicitely?
#cache-threshold-log = yes

[page]
# Memory error accouting per 4K memory page
# Threshold for the correct memory errors trigger script
memory-ce-threshold = 10 / 24h
# Trigger script for corrected errors
# memory-ce-trigger = page-error-trigger
# Should page threshold events be logged explicitely?
memory-ce-log = yes
# specify the internal action in mcelog to exceeding a page error threshold
# this is done in addition to executing the trigger script if available
# off no action
# account only account errors
# soft try to soft-offline page without killing any processes
# This requires an uptodate kernel. Might not be successfull.
# hard try to hard-offline page by killing processes
# Requires an uptodate kernel. Might not be successfull.
# soft-then-hard First try to soft offline, then try hard offlining
#memory-ce-action = off|account|soft|hard|soft-then-hard
memory-ce-action = soft

[trigger]
# Maximum number of running triggers
children-max = 2
# execute triggers in this directory
directory = /etc/mcelog
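
With this many commented-out lines it is easy to lose track of what is actually set. A minimal sketch for listing only the active settings of such a file follows; the function name active_settings is hypothetical, and the path passed to it is whatever your configuration file is called.

```shell
#!/bin/sh
# Hypothetical helper: print only the active lines of a config file,
# i.e. drop blank lines and lines whose first non-blank character is '#'.
active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

For example, active_settings /etc/mcelog/mcelog.conf. For the file above, the result would include logfile = /tmp/logfile, raw = yes and no-syslog = yes, so any entries would presumably be appended to /tmp/logfile in raw, undecoded form rather than to /var/log/mcelog or syslog.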
 
Old 11-25-2012, 11:16 AM   #4
Iyyappan
Member
 
Registered: Dec 2008
Location: Chennai, India
Distribution: CentOS 5, SLES 11
Posts: 229

Original Poster
Rep: Reputation: 4
Is installing mcelog and setting up cron enough, or do we need to change any settings in the mcelog.conf file?

Can anyone suggest which settings should be made in mcelog.conf, or point to a short page that explains them?

When I run /etc/init.d/mcelogd status as a non-root user, I get a message saying that /dev/mcelog is not active. How can I fix this?

Last edited by Iyyappan; 11-25-2012 at 12:18 PM.
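
The status message seen as a non-root user may simply be a permissions matter: /dev/mcelog is typically accessible only to root, so a check run as an ordinary user can fail even when the daemon is fine. That is an assumption about this system, not something the thread confirms. A small sketch to tell the two cases apart, with the made-up function name can_read_mcelog:

```shell
#!/bin/sh
# Hypothetical helper: say whether the current user can read the device.
# The default path is an assumption and can be overridden.
can_read_mcelog() {
    dev="${1:-/dev/mcelog}"
    if [ ! -e "$dev" ]; then
        echo "missing"            # no such device node at all
    elif [ -r "$dev" ]; then
        echo "readable"           # current user may read it
    else
        echo "permission denied"  # node exists but is not readable
    fi
}
```

If this prints "permission denied" for an ordinary user but "readable" for root, running the status check as root is probably all that is needed. As for cron: with daemon = yes, the comments in the configuration file above say mcelog keeps running in the background and polls the kernel itself, so an additional cron job may be redundant.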
 
  


All times are GMT -5.