Smart Error At Bootup, I get this error from suse only
I have Vista, SUSE, and a dozen or so other *nix systems installed on separate partitions on my first WD SATA 250 GB drive. All is golden with it; however, SUSE 10.3 labels my partitions sdc rather than sda as all the other *nix systems on the drive do, and each startup displays a window with this error message:
"Your hard disk drive is failing! S.M.A.R.T. message: Device: /dev/sdc, Failed SMART usage Attribute: 190 Temperature_Celsius". The drive seems to be working fine, and no other OS installed on the same drive displays this message. And 190 Celsius? That's four times hotter than my CPU! |
Not all operating systems monitor the SMART data.
Your drive cannot work at 190 degrees Celsius. That figure is about 3 times what ordinary drives can take. However, if we take this value to be in Fahrenheit, then it corresponds to roughly 88C, which is still ~30C too hot, but more believable. Assuming that your drive is in fact running at 88C, I can only recommend that you shut down your PC RIGHT NOW and purchase a drive cooler. I have seen some that you can bolt onto the underside of the drive, consisting of 2 fans. In my case, they bring the temperature of the drives down by about 10C; that would still be too hot for you, though. Maybe you also need to drill some extra holes in your case and place more fans there. . . |
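For anyone who wants to check the Fahrenheit guess by hand, here is a minimal Python sketch of the conversion. The function name is made up for illustration, and whether the reported number is a Fahrenheit reading at all is only an assumption at this point in the thread.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius: C = (F - 32) * 5 / 9."""
    return (deg_f - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    # 190 is the number from the SUSE warning, read here as Fahrenheit
    # purely as a guess.
    print(round(fahrenheit_to_celsius(190), 2))
```

If the guess were right, 190 would come out just under 88C, which is where the "still too hot" estimate comes from.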
And if your drives are so hot, take a look in your /var/log/messages:
grep -i tempe /var/log/messages
would show you all entries regarding temperature.
Note: SUSE may place the messages log somewhere else; consult /etc/syslog.conf if you cannot find the file. |
Must be a bug. This is what I get with
smartctl -a /dev/sdc
smartctl version 5.37 [i686-suse-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar SE (Serial ATA) family
Device Model:     WDC WD2500JS-00MHB1
Serial Number:    WD-WMANK1457533
Firmware Version: 10.02E01
User Capacity:    250,059,350,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Oct 26 00:52:33 2007 ADT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (8280) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  96) minutes.
Conveyance self-test routine
recommended polling time:        (   6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always   -           0
  3 Spin_Up_Time            0x0003   206   187   021    Pre-fail  Always   -           4700
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always   -           2646
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000f   200   200   051    Pre-fail  Always   -           0
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always   -           6993
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always   -           0
 11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always   -           2586
190 Temperature_Celsius     0x0022   073   001   045    Old_age   Always   In_the_past 27
194 Temperature_Celsius     0x0022   123   001   000    Old_age   Always   -           27
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always   -           0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           1
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline  -           0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay. |
Quote:
Hi, interesting. The same minute I started my computer, the temperature of my second drive was logged as:
Oct 26 06:37:27 riba smartd[4512]: Device: /dev/sdb, SMART Usage Attribute: 194 Temperature_Celsius changed from 91 to 90
That cannot be true. Gkrellm shows the temperature is 33 degrees Celsius. |
Quote:
91F = 32.78C |
Quote:
Oct 26 06:37:27 riba smartd[4512]: Device: /dev/sdb, SMART Usage Attribute: 194 Temperature_Celsius changed from 91 to 90 |
I think the actual temp is not even reported; 190 must be the type of error.
Anyway, I am almost convinced it is a bug; this drive works fine otherwise. The sda/hda confusion, I am thinking, is some quirk of my BIOS and of the IDE and SATA drives being used together. Thanks for the replies. |
I realize you are not using Fedora Core, but their latest version (still 7 at the moment) started calling ALL drives sd*. Maybe SUSE is doing something similar?
|
Quote:
|
So in your sample, the actual temperature of the drive (as noted above) is 27 degrees Celsius. I don't know how I could have missed it. My apologies for scaring you like that ;)
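To make the difference between the columns concrete, here is a rough Python sketch that splits the two temperature rows from the smartctl output earlier in the thread into their fields. It assumes the ten-column layout shown in that output (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE). The class and function names are hypothetical, reading RAW_VALUE as degrees Celsius follows the interpretation in this thread, and the "failed in the past" check is only a guess at how WORST and THRESH relate to the WHEN_FAILED column, not smartctl's actual code.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int    # normalized current value (not a temperature)
    worst: int    # lowest normalized value ever recorded
    thresh: int   # failure threshold for the normalized value
    when_failed: str
    raw: str      # vendor-specific raw value


def parse_attribute_row(row: str) -> SmartAttribute:
    """Split one whitespace-separated row of the smartctl attribute table."""
    fields = row.split()
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        thresh=int(fields[5]),
        when_failed=fields[8],
        raw=" ".join(fields[9:]),
    )


def looks_failed_in_the_past(attr: SmartAttribute) -> bool:
    """Assumed rule: WORST dipped to or below a non-zero THRESH, VALUE has recovered."""
    return attr.thresh > 0 and attr.worst <= attr.thresh < attr.value


if __name__ == "__main__":
    # The two rows below are copied from the smartctl output posted in this thread.
    rows = [
        "190 Temperature_Celsius 0x0022 073 001 045 Old_age Always In_the_past 27",
        "194 Temperature_Celsius 0x0022 123 001 000 Old_age Always - 27",
    ]
    for row in rows:
        attr = parse_attribute_row(row)
        print(attr.attr_id, "raw:", attr.raw, "value:", attr.value,
              "worst:", attr.worst, "thresh:", attr.thresh,
              "failed in the past:", looks_failed_in_the_past(attr))
```

Under these assumptions, both rows carry a raw value of 27, and only attribute 190 trips the check, because its WORST of 001 is below its THRESH of 045 while 194 has a THRESH of 000.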
|
No prob, I appreciate the advice.
|
It is a known Western Digital firmware bug:
from http://www.bugtrack.almico.com/view.php?id=468 : Quote:
|
I don't know what's more interesting, that they actually replied to your mail, or that they admit to faulty firmware.
|
It is all quite intriguing indeed :)
Currently I am searching for a way to configure the SMART daemon (smartd) to either disable the temperature checking (and only that), or configure an offset à la SpeedFan. Does anyone know if this is possible? |