LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

06-13-2012, 07:05 AM   #1
tocard
LQ Newbie
 
Registered: Jun 2012
Location: Paris
Distribution: Debian
Posts: 2

Rep: Reputation: Disabled
How to monitor memory issues ?


Hello,

I've been using mcelog for some time now and have had no complaints; it does the job.

But recently, I had this:

Quote:
Hardware event. This is not a software error.
MCE 0
CPU 16 BANK 9
TIME 1338562802 Fri Jun 1 17:00:02 2012
MCG status:
MCi status:
Corrected error
Error enabled
MCA: MEMORY CONTROLLER RD_CHANNELunspecified_ERR
Transaction: Memory read error
STATUS 900000400800009f MCGSTATUS 0
MCGCAP 1000c18 APICID 80 SOCKETID 2
CPUID Vendor Intel Family 6 Model 47
OK, so it looks like a memory issue. But how can I tell which slot is affected?
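
For readers wondering where the "Corrected error" and "Memory read error" lines come from, here is a minimal sketch that decodes the STATUS value quoted above. It assumes the IA32_MCi_STATUS bit layout described in Intel's Software Developer's Manual. The function name `decode_mci_status` and the field names it returns are made up for illustration, and this is not a description of how mcelog itself is implemented.

```python
# Sketch only: bit positions assume the IA32_MCi_STATUS layout from the Intel SDM.
MEM_OPS = {0: "generic", 1: "read", 2: "write", 3: "address/command", 4: "scrub"}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into a few named fields (hypothetical helper)."""
    mca_code = status & 0xFFFF  # bits 15:0, architectural MCA error code
    fields = {
        "valid": bool((status >> 63) & 1),        # VAL
        "overflow": bool((status >> 62) & 1),     # OVER
        "uncorrected": bool((status >> 61) & 1),  # UC
        "enabled": bool((status >> 60) & 1),      # EN
        "addr_valid": bool((status >> 58) & 1),   # ADDRV
        # Bits 52:38; only meaningful when MCG_CAP bit 10 (CMCI_P) is set.
        "corrected_count": (status >> 38) & 0x7FFF,
        "mca_code": mca_code,
    }
    # Memory controller errors follow the pattern 0000 0000 1MMM CCCC.
    if (mca_code & 0xFF80) == 0x0080:
        fields["mem_op"] = MEM_OPS.get((mca_code >> 4) & 0x7, "reserved")
        channel = mca_code & 0xF
        # CCCC == 1111 means the channel is not specified.
        fields["channel"] = None if channel == 0xF else channel
    return fields


if __name__ == "__main__":
    for name, value in decode_mci_status(0x900000400800009F).items():
        print(f"{name}: {value}")
```

Applied to the value in the quote, the sketch reports a valid, enabled, corrected error with a corrected-error count of 1, a memory read operation and no channel specified. The address-valid bit is clear, which would be consistent with the record not pointing at a specific DIMM.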

Looking for information specific to IBM hardware, I found the following page:
http://www-947.ibm.com/support/entry...d=MIGR-5084973

Quote:
Do not use the Linux MCE daemon.
IBM recommends to not deploy these programs on System x servers, which have system firmware and an Integrated Management Module (IMM) to properly interpret correctable error counts, accommodate hardware errata, provide predictive failure alerts, and take system actions to prevent uncorrectable errors.
HP and other hardware vendors have similar pages.

Ok, so I can't rely on mcelog. What should I use to monitor memory on these servers?

I'm running memtest86+, but it would be better if I could check for problems without taking the OS down for a whole day.
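
One way to look at memory error counters while the OS stays up is the kernel's EDAC sysfs interface. The sketch below is only an illustration under assumptions: it presumes an EDAC driver is loaded for the platform and exposes `ce_count` and `ue_count` under `/sys/devices/system/edac/mc/mc*`, which may not be the case on these servers. The function name `read_edac_counts` is hypothetical, and whether OS-level counters are appropriate on hardware where the firmware handles correctable errors is a question for the vendor.

```python
from pathlib import Path

# Default location of the EDAC memory controller entries in sysfs.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    Hypothetical helper: returns an empty dict when no EDAC controllers
    are exposed (for example, when no EDAC driver is loaded).
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None  # file missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for controller, values in read_edac_counts().items():
        print(controller, values)
```

Run periodically (from cron, say), a rising `ce_count` indicates corrected errors and any non-zero `ue_count` indicates uncorrected ones.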

Thanks.
 
06-14-2012, 12:30 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,348

Rep: Reputation: 2749
Well that text says
Quote:
provide predictive failure alerts,
so presumably there's a way to access them, probably a dedicated rpm file from IBM.
I'd contact your vendor (IBM or agent) or google it.

As an example, HP provide a PSP (product support pack) that is basically a collection of rpms that provide monitoring services and such-like.

Last edited by chrism01; 06-14-2012 at 12:31 AM.
 
06-19-2012, 06:48 AM   #3
tocard
LQ Newbie
 
Registered: Jun 2012
Location: Paris
Distribution: Debian
Posts: 2

Original Poster
Rep: Reputation: Disabled
Hello,

Thanks for your answer.


It seems that hpasmcli will do the trick.

I'm not sure of what to use for IBM.
 
12-03-2012, 04:51 AM   #4
Iyyappan
Member
 
Registered: Dec 2008
Location: Chennai, India
Distribution: CentOS 5, SLES 11
Posts: 245

Rep: Reputation: 4
Quote:
Originally Posted by tocard
Hello,

I've been using mcelog for some time now.
Can you provide the steps you followed, along with the configuration you made? I have installed mcelog and need to check whether it is working properly.
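
As a rough first check of whether mcelog is in place, the sketch below looks for the `/dev/mcelog` character device and for a running process whose executable is named `mcelog`. Both checks are assumptions about a typical setup rather than an official test: the helper name `mcelog_status` is hypothetical, and on some distributions mcelog is run periodically from cron rather than as a daemon, in which case no process will be found even though it works.

```python
import os
import stat


def mcelog_status(dev="/dev/mcelog", proc="/proc"):
    """Report whether the mcelog device exists and which PIDs run mcelog.

    Hypothetical helper; paths are parameters so they can be overridden.
    """
    try:
        device_present = stat.S_ISCHR(os.stat(dev).st_mode)
    except OSError:
        device_present = False  # device node missing or not accessible

    pids = []
    try:
        entries = os.listdir(proc)
    except OSError:
        entries = []  # no procfs available
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "cmdline"), "rb") as f:
                argv0 = f.read().split(b"\0")[0]
        except OSError:
            continue  # process exited or is not readable
        if os.path.basename(argv0) == b"mcelog":
            pids.append(int(entry))
    return {"device_present": device_present, "daemon_pids": sorted(pids)}


if __name__ == "__main__":
    print(mcelog_status())
```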
 
12-04-2012, 07:22 PM   #5
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,348

Rep: Reputation: 2749
This may help http://www.cyberciti.biz/tips/linux-...e-failure.html
 
12-06-2012, 03:04 AM   #6
Iyyappan
Member
 
Registered: Dec 2008
Location: Chennai, India
Distribution: CentOS 5, SLES 11
Posts: 245

Rep: Reputation: 4
In the mcelog.conf file, many options are enabled by default and many others are commented out. Do we need to change any of them? I am not using syslog to store logs. Should I specify the CPU type, frequency and other details? I read in a post that mcelog detects the CPU currently in use and reports an error if anything goes wrong with it. Have I understood that correctly?
 
  


All times are GMT -5. The time now is 10:39 AM.
