LinuxQuestions.org
Old 10-25-2007, 12:00 PM   #1
betamaxman
LQ Newbie
 
Registered: May 2006
Posts: 29

Rep: Reputation: 15
Smart Error At Bootup, I get this error from suse only


I have Vista, SUSE, and a dozen or so other *nix systems installed on separate partitions on my first WD SATA 250 GB drive. All is golden with it; however, SUSE 10.3 labels my partitions sdc rather than sda, as all the other *nix systems on the drive do, and each startup displays a window with this error message:

"Your hard disk drive is failing! S.M.A.R.T. message: Device: /dev/sdc, Failed SMART usage Attribute: 190 Temperature_Celsius".
The drive seems to be working fine, and no other OS installed on the same drive displays this message. And 190 Celsius? That's four times hotter than my CPU!
 
Old 10-25-2007, 01:12 PM   #2
x_terminat_or_3
Member
 
Registered: Mar 2007
Location: Plymouth, UK
Distribution: Fedora Core, RHEL, Arch
Posts: 342

Rep: Reputation: 38
Not all operating systems monitor the SMART data.

Your drive cannot work at 190 degrees Celsius. That figure is about 3 times what ordinary drives can take. However, if we take this value to be in Fahrenheit, it corresponds to about 88C, which is still ~30C too hot, but more believable.

Assuming that your drive is in fact running at about 88C, I can only recommend that you shut down your PC RIGHT NOW and purchase a drive cooler. I have seen some that you bolt onto the underside of the drive, consisting of 2 fans. In my case, they bring the temperature of the drives down by about 10C; that would still be too hot for you, though. Maybe you also need to drill some extra holes in your case and place more fans there...
 
Old 10-25-2007, 01:16 PM   #3
x_terminat_or_3
Member
 
Registered: Mar 2007
Location: Plymouth, UK
Distribution: Fedora Core, RHEL, Arch
Posts: 342

Rep: Reputation: 38
And if your drives are so hot, take a look in your /var/log/messages


grep -i tempe /var/log/messages

This shows all entries regarding temperature.

Note: SUSE may place the messages log somewhere else; consult /etc/syslog.conf if you cannot find the file.
 
Old 10-25-2007, 10:58 PM   #4
betamaxman
LQ Newbie
 
Registered: May 2006
Posts: 29

Original Poster
Rep: Reputation: 15
It must be a bug. This is what I get with
smartctl -a /dev/sdc


smartctl version 5.37 [i686-suse-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar SE (Serial ATA) family
Device Model: WDC WD2500JS-00MHB1
Serial Number: WD-WMANK1457533
Firmware Version: 10.02E01
User Capacity: 250,059,350,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Fri Oct 26 00:52:33 2007 ADT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (8280) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  96) minutes.
Conveyance self-test routine
recommended polling time:        (   6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always   -           0
  3 Spin_Up_Time            0x0003   206   187   021    Pre-fail  Always   -           4700
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always   -           2646
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000f   200   200   051    Pre-fail  Always   -           0
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always   -           6993
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always   -           0
 11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always   -           2586
190 Temperature_Celsius     0x0022   073   001   045    Old_age   Always   In_the_past 27
194 Temperature_Celsius     0x0022   123   001   000    Old_age   Always   -           27
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always   -           0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           1
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline  -           0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 
Old 10-26-2007, 12:07 AM   #5
riba43
Member
 
Registered: Feb 2005
Location: Slovenia
Distribution: suse11.0
Posts: 749

Rep: Reputation: 31

Quote:
Originally Posted by x_terminat_or_3
And if your drives are so hot, take a look in your /var/log/messages


grep -i tempe /var/log/messages

Would show you all entries regarding temperature

Note: suse may place the messages log somewhere else, consult /etc/syslog.conf if you cannot find the file.

Hi,
Interesting. In the same minute I started my computer, the temperature of my second drive is logged as:

Oct 26 06:37:27 riba smartd[4512]: Device: /dev/sdb, SMART Usage Attribute: 194 Temperature_Celsius changed from 91 to 90

That cannot be true. GKrellM shows the temperature is 33 degrees Celsius.
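For anyone who wants to pull the numbers out of such log lines, here is a minimal Python sketch. The regular expression is written only against the single line quoted above, so it is an assumption that other smartd versions word the message the same way. Also note that, as far as I know, smartd tracks the normalized VALUE column by default rather than the raw reading, so the 91 and 90 may not be degrees at all.

```python
import re

# Log line copied from the post above.
LINE = ("Oct 26 06:37:27 riba smartd[4512]: Device: /dev/sdb, SMART Usage "
        "Attribute: 194 Temperature_Celsius changed from 91 to 90")

# Pattern for smartd's "attribute changed" message. It is derived from the
# one sample line only; treat the exact wording as an assumption.
PATTERN = re.compile(
    r"Device: (?P<device>[^,\s]+), SMART (?P<kind>Usage|Prefailure) Attribute: "
    r"(?P<id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)


def parse_change(line):
    """Return the fields of an attribute-change message, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("id", "old", "new"):
        fields[key] = int(fields[key])
    return fields


change = parse_change(LINE)
print(change)
```

The attribute ID (194) and the two logged values come back as integers, which makes it easy to compare them against the VALUE and RAW_VALUE columns of smartctl -a for the same drive.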
 
Old 10-26-2007, 01:16 AM   #6
x_terminat_or_3
Member
 
Registered: Mar 2007
Location: Plymouth, UK
Distribution: Fedora Core, RHEL, Arch
Posts: 342

Rep: Reputation: 38
Quote:
That can not be true. Gkrellm shows the temperature is 33deg. celsius.
Ah, but it is true; this confirms it. Even though it says Temperature_Celsius, the temperature is in Fahrenheit.

91F = 32.78C
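The conversion used in this post and in post #2 can be checked with a few lines of Python. This is only the arithmetic; it does not show that the drive really reports Fahrenheit. The function name is made up for the example.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


# The two readings discussed in the thread, treated as Fahrenheit.
for reading in (91, 190):
    print(f"{reading}F = {fahrenheit_to_celsius(reading):.2f}C")
# prints:
# 91F = 32.78C
# 190F = 87.78C
```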
 
Old 10-26-2007, 01:35 AM   #7
riba43
Member
 
Registered: Feb 2005
Location: Slovenia
Distribution: suse11.0
Posts: 749

Rep: Reputation: 31
Quote:
Originally Posted by x_terminat_or_3
Ah, but it is true; this confirms it. Even though it says Temperature_Celsius, the temperature is in Fahrenheit.

91F = 32.78C
Yes, but it says the temperature is in Celsius!!??

Oct 26 06:37:27 riba smartd[4512]: Device: /dev/sdb, SMART Usage Attribute: 194 Temperature_Celsius changed from 91 to 90

 
Old 10-27-2007, 11:18 PM   #8
betamaxman
LQ Newbie
 
Registered: May 2006
Posts: 29

Original Poster
Rep: Reputation: 15
I think the actual temperature is not even reported; 190 must be the type of error.
Anyway, I am almost convinced it is a bug; this drive works fine otherwise.
I am thinking the sda/hda confusion is some quirk of my BIOS and the IDE and SATA drives being used together.
Thanks for the replies.
 
Old 10-29-2007, 05:14 PM   #9
x_terminat_or_3
Member
 
Registered: Mar 2007
Location: Plymouth, UK
Distribution: Fedora Core, RHEL, Arch
Posts: 342

Rep: Reputation: 38
I realize you are not using Fedora Core, but their latest version (still 7 at the moment) started calling ALL drives sd*. Maybe SUSE is doing something similar?
 
Old 10-29-2007, 05:17 PM   #10
x_terminat_or_3
Member
 
Registered: Mar 2007
Location: Plymouth, UK
Distribution: Fedora Core, RHEL, Arch
Posts: 342

Rep: Reputation: 38
Quote:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   206   204   063    Pre-fail  Always   -           13009
  4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always   -           1067
  5 Reallocated_Sector_Ct   0x0033   253   050   063    Pre-fail  Always   In_the_past 0
  6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline  -           0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always   -           0
  8 Seek_Time_Performance   0x0027   249   233   187    Pre-fail  Always   -           55136
  9 Power_On_Minutes        0x0032   207   207   000    Old_age   Always   -           806h+54m
 10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always   -           0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   251   251   000    Old_age   Always   -           1067
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always   -           0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always   -           0
194 Temperature_Celsius     0x0032   253   253   000    Old_age   Always   -           45
This is part of the SMART output for one of my drives. As you can see, the actual temperature in degrees Celsius is given.
 
Old 10-29-2007, 05:18 PM   #11
x_terminat_or_3
Member
 
Registered: Mar 2007
Location: Plymouth, UK
Distribution: Fedora Core, RHEL, Arch
Posts: 342

Rep: Reputation: 38
So in your sample, the actual temperature of the drive (as noted above) is 27 degrees Celsius. I don't know how I could have missed it. My apologies for scaring you like that.
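To make that reading of the table concrete, here is a minimal Python sketch that splits the two temperature rows from post #4 into named columns. The column names follow the smartctl header; the helper names and the "failed in the past" rule (worst value at or below a non-zero threshold) are my own summary of how the WHEN_FAILED column appears to work, so treat them as assumptions.

```python
# Two rows copied from the smartctl -a output in post #4.
ROWS = """\
190 Temperature_Celsius 0x0022 073 001 045 Old_age Always In_the_past 27
194 Temperature_Celsius 0x0022 123 001 000 Old_age Always - 27
"""

COLUMNS = ("id", "name", "flag", "value", "worst", "thresh",
           "type", "updated", "when_failed", "raw")


def parse_row(line):
    """Split one attribute row into a dict keyed by column name."""
    # Limit the split so a raw value containing spaces stays in one field.
    fields = line.strip().split(None, len(COLUMNS) - 1)
    row = dict(zip(COLUMNS, fields))
    for key in ("id", "value", "worst", "thresh"):
        row[key] = int(row[key])
    return row


def failed_in_past(row):
    """Assumed rule: worst normalized value at or below a non-zero threshold."""
    return row["thresh"] > 0 and row["worst"] <= row["thresh"]


rows = [parse_row(line) for line in ROWS.splitlines()]
for row in rows:
    print(f"attribute {row['id']}: raw={row['raw']} value={row['value']} "
          f"worst={row['worst']} thresh={row['thresh']} "
          f"failed_in_past={failed_in_past(row)}")
```

Run against those two rows, it reports a raw value of 27 for both attributes, and only attribute 190 counts as failed in the past (worst 001 against threshold 045). So the 190 in the boot warning is the attribute ID, not a temperature.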
 
Old 11-02-2007, 09:22 PM   #12
betamaxman
LQ Newbie
 
Registered: May 2006
Posts: 29

Original Poster
Rep: Reputation: 15
No prob, I appreciate the advice.
 
Old 11-05-2007, 01:29 PM   #13
matti3
LQ Newbie
 
Registered: Nov 2007
Location: Belgium
Distribution: openSUSE 11.4, opensuse 12.1
Posts: 15

Rep: Reputation: 0
It is a known Western Digital firmware bug:
from http://www.bugtrack.almico.com/view.php?id=468 :
Quote:
I contacted WD and this is their response:

The temperatures reported by all SMART monitoring software is incorrect for the WD2500KS due to a firmware bug. The drive is not defective but the temperatures that the revision of the drive you have bought report to software are incorrect. We are working on a solution.

Last edited by matti3; 11-05-2007 at 01:30 PM.
 
Old 11-05-2007, 03:05 PM   #14
x_terminat_or_3
Member
 
Registered: Mar 2007
Location: Plymouth, UK
Distribution: Fedora Core, RHEL, Arch
Posts: 342

Rep: Reputation: 38
I don't know what's more interesting, that they actually replied to your mail, or that they admit to faulty firmware.
 
Old 11-05-2007, 04:05 PM   #15
matti3
LQ Newbie
 
Registered: Nov 2007
Location: Belgium
Distribution: openSUSE 11.4, opensuse 12.1
Posts: 15

Rep: Reputation: 0
It is all quite intriguing indeed.

Currently I am searching for a way to configure the SMART daemon (smartd) to either disable the temperature checking (and only that) or configure an offset à la SpeedFan. Does anyone know if this is possible?

Last edited by matti3; 11-05-2007 at 04:06 PM.
 
  

