LinuxQuestions.org


trebek 03-22-2008 12:24 AM

Hard disk sector error
 
Hello there lads. I've been experiencing a bit of a problem with my hard drive. For some time now, the red drive LED has been flashing steadily, even when I knew nothing was running that needed that much hard drive access. So I decided to watch my /var/log/kern.log to see if something was going on.

When I ran tail -f kern.log, the following lines were displayed, so I figure there is some sort of error with the disk:
Mar 21 23:22:23 esteban-desktop kernel: [25951.004281] ide: failed opcode was: unknown
Mar 21 23:22:23 esteban-desktop kernel: [25951.004287] end_request: I/O error, dev hda, sector 5962654
Mar 21 23:22:24 esteban-desktop kernel: [25952.214286] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Mar 21 23:22:24 esteban-desktop kernel: [25952.214297] hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=5962655, sector=5962654
Mar 21 23:22:24 esteban-desktop kernel: [25952.214307] ide: failed opcode was: unknown
Mar 21 23:22:24 esteban-desktop kernel: [25952.214313] end_request: I/O error, dev hda, sector 5962654
Mar 21 23:22:25 esteban-desktop kernel: [25953.224497] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Mar 21 23:22:26 esteban-desktop kernel: [25953.224509] hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=5962655, sector=5962654
Mar 21 23:22:26 esteban-desktop kernel: [25953.224518] ide: failed opcode was: unknown
Mar 21 23:22:26 esteban-desktop kernel: [25953.224525] end_request: I/O error, dev hda, sector 5962654
Mar 21 23:22:27 esteban-desktop kernel: [25954.356814] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Mar 21 23:22:27 esteban-desktop kernel: [25954.356826] hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=5962655, sector=5962654
Mar 21 23:22:27 esteban-desktop kernel: [25954.356836] ide: failed opcode was: unknown
Mar 21 23:22:27 esteban-desktop kernel: [25954.356843] end_request: I/O error, dev hda, sector 5962654
Mar 21 23:22:27 esteban-desktop kernel: [25954.356850] printk: 3 messages suppressed.
Mar 21 23:22:27 esteban-desktop kernel: [25954.356854] Buffer I/O error on device hda2, logical block 620828
Mar 21 23:22:28 esteban-desktop kernel: [25955.466927] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Mar 21 23:22:28 esteban-desktop kernel: [25955.466939] hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=5962655, sector=5962654
Mar 21 23:22:28 esteban-desktop kernel: [25955.466948] ide: failed opcode was: unknown
Mar 21 23:22:28 esteban-desktop kernel: [25955.466955] end_request: I/O error, dev hda, sector 5962654
Mar 21 23:22:29 esteban-desktop kernel: [25956.699185] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Mar 21 23:22:29 esteban-desktop kernel: [25956.699196] hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=5962655, sector=5962654
Mar 21 23:22:29 esteban-desktop kernel: [25956.699206] ide: failed opcode was: unknown
Mar 21 23:22:29 esteban-desktop kernel: [25956.699213] end_request: I/O error, dev hda, sector 5962654
Mar 21 23:22:30 esteban-desktop kernel: [25957.720450] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Mar 21 23:22:30 esteban-desktop kernel: [25957.720462] hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=5962655, sector=5962654
Mar 21 23:22:30 esteban-desktop kernel: [25957.720472] ide: failed opcode was: unknown
Mar 21 23:22:30 esteban-desktop kernel: [25957.720479] end_request: I/O error, dev hda, sector 5962654

All of these lines appeared within just a few seconds of running the tail command. So I now believe there is a problem, specifically with sector 5962654, which is the sector that appears most often in the log file.
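
For reference, the repeated kernel messages can be summarized mechanically. The following is a minimal Python sketch, not something posted in the thread: it tallies the sectors named in the end_request lines and decodes the status and error bytes using the standard ATA register bit layout. The function names are illustrative, and the bit labels are chosen to match the names the kernel prints in the log above.

```python
import re
from collections import Counter

# ATA status register bits, labelled with the names seen in the kernel log.
STATUS_BITS = {
    0x80: "Busy", 0x40: "DriveReady", 0x20: "DeviceFault", 0x10: "SeekComplete",
    0x08: "DataRequest", 0x04: "CorrectedError", 0x02: "Index", 0x01: "Error",
}

# ATA error register bits (media-change bits 0x20 and 0x08 omitted).
ERROR_BITS = {
    0x80: "BadSectorOrCRC", 0x40: "UncorrectableError", 0x10: "SectorIdNotFound",
    0x04: "DriveStatusError", 0x02: "TrackZeroNotFound", 0x01: "AddrMarkNotFound",
}

SECTOR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def decode(value, table):
    """Return the names of the bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True) if value & bit]


def count_failed_sectors(lines):
    """Count how often each (device, sector) pair appears in end_request errors."""
    counts = Counter()
    for line in lines:
        match = SECTOR_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    print(decode(0x51, STATUS_BITS))  # ['DriveReady', 'SeekComplete', 'Error']
    print(decode(0x40, ERROR_BITS))   # ['UncorrectableError']
    # To tally a real log: count_failed_sectors(open("/var/log/kern.log"))
```

Run against the excerpt above, the tally has a single entry, (hda, 5962654): every end_request line names the same sector.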

How can I fix this? Or how can I tell Linux not to use that sector, if it is indeed bad?
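
One way to see which part of the file system sits on that sector is the arithmetic described in the smartmontools Bad Block HOWTO: file-system block = (LBA - partition start) * 512 / block size. The sketch below is an illustration under assumptions, not a verified diagnosis. The 4096-byte block size is a guess, and the hda2 start sector is not given anywhere in the thread; it is only inferred by working backwards from the "Buffer I/O error on device hda2, logical block 620828" line. Both values should be checked against real output, for example from fdisk -lu /dev/hda.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed for a drive of this era)


def partition_start_from_log(failing_sector, logical_block, block_size):
    """Infer a partition's first sector from one kernel error pair.

    Assumes the failing sector is the first sector of the reported block.
    """
    return failing_sector - logical_block * (block_size // SECTOR_SIZE)


def fs_block(failing_sector, partition_start, block_size):
    """Bad Block HOWTO formula: b = int((L - S) * 512 / B)."""
    return (failing_sector - partition_start) * SECTOR_SIZE // block_size


if __name__ == "__main__":
    # Values taken from the kernel log; 4096 is an assumed block size.
    start = partition_start_from_log(5962654, 620828, 4096)
    print(start)                            # 996030
    print(start % (255 * 63))               # 0
    print(fs_block(5962654, start, 4096))   # 620828
```

The round trip back to block 620828 is circular by construction. What makes the guess plausible is that 996030 falls exactly on a cylinder boundary under the common 255-head, 63-sector geometry. The formula only becomes useful once the real start sector and block size are plugged in.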

trebek 03-22-2008 12:57 AM

Hi, I decided to post this as a separate reply because the other one is a bit long and things could get messy. This post is about the tests I performed on my hard drive using smartmontools.

The command I issued to run the test was: smartctl -t long /dev/hda. I ran the test twice, and this is what the two runs yielded:
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      6113         -
# 2  Extended offline    Completed without error       00%      6113         -

I ran the long test and apparently it found nothing wrong. But kern.log begs to differ.

On a side note, I noticed that the hard disk LED is now blinking steadily, as if to tell me there is an alert of some kind.

onebuck 03-22-2008 01:10 PM

Hi,

I would go to the HDD manufacturer's site and get the diagnostics for the drive. Run those to make certain the drive is OK.

Junior Hacker 03-22-2008 01:52 PM

It is possible you have a bad sector, which a program called SpinRite would repair by relocating the data in it and isolating the bad sector so it can't be used again. Modern hard drives normally do this on their own, without third-party software.
It is also possible the drive is in the middle of that process at this very moment, but can't get a clean read of the bad sector, so it keeps making pass after pass trying to move the data to a spare sector.
Or there could be some malicious tool/software in that sector causing this.
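
Whether a drive has been remapping sectors on its own can usually be seen in the SMART attribute table printed by smartctl -A. The following is a rough Python sketch of pulling out the relevant counters from that text. It rests on assumptions worth verifying against real output: the attribute names listed are the ones smartmontools commonly prints, but vendors differ, and the parser assumes the usual ten-column layout with the raw value in the last column. The function name is made up for illustration.

```python
# Attribute names commonly associated with sector remapping (vendor-dependent).
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def remap_counters(smartctl_a_output):
    """Extract raw values of remapping-related attributes from `smartctl -A` text."""
    found = {}
    for line in smartctl_a_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID followed by the attribute name;
        # the tenth column is assumed to hold the raw value.
        if (len(fields) >= 10 and fields[0].isdigit()
                and fields[1] in WATCHED and fields[9].isdigit()):
            found[fields[1]] = int(fields[9])
    return found
```

A non-zero Current_Pending_Sector would be consistent with a sector the drive cannot read and has not yet been able to remap, though the attribute set varies by model.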

onebuck 03-22-2008 10:13 PM

Hi,

Most HDD diagnostics provide a means to diagnose and repair.

Drakeo 03-22-2008 10:53 PM

Well, it looks like you boot from the root sector instead of the MBR on hda. So the boot goes through the MBR, then looks to the root sector on hda1, and then everything boots fine. And since the MBR has never been formatted and used as a master boot record, you will get this message: "ide: failed opcode". Why? No file system. Another cause is when you run fdisk or cfdisk and never tag the partition bootable, and you boot from the root sector of /dev/hda1 or whatever hard drive you want.

trebek 03-23-2008 01:48 AM

I have gone through some pages of the smartmontools man pages, but I haven't seen the option, if there is any, to fix the errors or relocate the data and isolate the bad sectors using smartctl. I'll look for one and more info as well.

On the other hand, I've got a laptop that doesn't show any visible problem. Nevertheless, I ran the test on the laptop too and got the following:
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       40%      5677         110144186
# 2  Extended offline    Completed: read failure       40%      5670         110144186
# 3  Extended offline    Completed without error       00%      5669         -
# 4  Extended offline    Completed: read failure       40%      5469         110144186

I also got this from smartmontools:
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

I am confused here: does my hard drive have a problem or not? On the desktop machine, kern.log shows that the hard drive has problems. I don't get that on my laptop, but I do get failures during the read part of the test. The desktop shows no test failures, everything says 'Completed without error', and yet kern.log shows a bunch of errors. CONFUSING STUFF!
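
The two laptop outputs are not necessarily contradictory. As far as I understand it, the overall-health line reflects whether SMART attributes have crossed their failure thresholds, while the self-test log records what individual surface scans ran into, so a drive can report PASSED and still log read failures. A minimal Python sketch for summarizing the self-test log follows. The function names are illustrative, and the pattern only handles the "Completed ..." statuses that appear in this thread.

```python
import re
from collections import Counter

# One row of the "SMART Self-test log" table, e.g.
# "# 1 Extended offline Completed: read failure 40% 5677 110144186"
ROW_RE = re.compile(
    r"^#\s*(\d+)\s+(.+?)\s+(Completed.*?)\s+(\d+)%\s+(\d+)\s+(\S+)$"
)


def parse_selftest_log(text):
    """Return one dict per self-test row; lba is None when the column is '-'."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if not match:
            continue
        num, desc, status, remaining, hours, lba = match.groups()
        rows.append({
            "num": int(num),
            "test": desc,
            "status": status,
            "remaining_pct": int(remaining),
            "hours": int(hours),
            "lba": None if lba == "-" else int(lba),
        })
    return rows


def failing_lbas(rows):
    """Count how many tests stopped at each LBA_of_first_error."""
    return Counter(row["lba"] for row in rows if row["lba"] is not None)


if __name__ == "__main__":
    laptop_log = """\
# 1 Extended offline Completed: read failure 40% 5677 110144186
# 2 Extended offline Completed: read failure 40% 5670 110144186
# 3 Extended offline Completed without error 00% 5669 -
# 4 Extended offline Completed: read failure 40% 5469 110144186
"""
    print(failing_lbas(parse_selftest_log(laptop_log)))  # Counter({110144186: 3})
```

For the laptop log above, three of the four extended tests stop at the same LBA, 110144186, which points to one consistently unreadable spot rather than random noise.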

Thanks, I'll post back with the results.

JoachimJ 03-23-2008 12:07 PM

Quote:

Originally Posted by trebek (Post 3097543)
I have gone through some pages of the smartmontools man pages, but I haven't seen the option, if there is any, to fix the errors or relocate the data and isolate the bad sectors using smartctl. I'll look for one and more info as well.

From the website, it seems smartmontools only reads SMART info. As far as I know, this is a relatively minimal, and therefore easily agreed-upon and standardized, set of information that the drive's own firmware can diagnose and make publicly available. I believe onebuck meant that you should use the HDD manufacturer's own software to perform a more thorough check outside the small world of what SMART logs.

I agree with that, and would add that the Ultimate Boot CD has a collection of HDD manufacturer utilities (IBM/Hitachi, Fujitsu, Samsung, Seagate/Maxtor, Maxtor/Quantum, Western Digital and Excelstor) as well as other diagnostic programs.
http://www.ultimatebootcd.com/

trebek 04-01-2008 02:13 PM

New hard drive
 
Well, truth is it wasn't looking very good, with the continuously blinking red LED and the error messages. So I just purchased another drive and will install it soon. I'll keep the old one as a backup for unimportant stuff.

Thanks mates.


All times are GMT -5.