HDD Dying?

mikeyt_333 · 05-13-2003, 07:49 PM

I ran dmesg today and got some messages that made me think my HDD was crapping out. After looking into it, I found that when I try to read my /var/log/maillog.1 I get the following:

May 13 18:09:16 ns1 kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
May 13 18:09:16 ns1 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=22474338, sector=20273432
May 13 18:09:16 ns1 kernel: end_request: I/O error, dev 03:03 (hda), sector 20273432
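
For anyone trying to read those lines, here is a rough Python sketch of how the status and error bytes break down into the flags the kernel prints, and how dev 03:03 maps to a partition. The bit names the log does not show (Busy, DeviceFault and so on) are labels for the standard ATA register bits, and the helper names (decode, parse_end_request, ide_partition) are made up for illustration, so treat this as an aid for reading the log rather than a diagnostic tool.

```python
import re

# ATA status register bits, highest bit first.
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

# ATA error register bits; only meaningful when the Error status bit is set.
ERROR_BITS = {
    0x80: "BadSector",
    0x40: "UncorrectableError",
    0x20: "MediaChanged",
    0x10: "SectorIdNotFound",
    0x08: "MediaChangeRequested",
    0x04: "DriveStatusError",
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True) if value & bit]


def parse_end_request(line):
    """Pull (major, minor, sector) out of an end_request line, or None."""
    m = re.search(r"dev ([0-9a-fA-F]{2}):([0-9a-fA-F]{2}) \(\w+\), sector (\d+)", line)
    if not m:
        return None
    # The kernel prints major and minor in hex, the sector in decimal.
    return int(m.group(1), 16), int(m.group(2), 16), int(m.group(3))


def ide_partition(name, minor):
    """Each IDE drive owns 64 minors; minor % 64 is the partition (0 = whole disk)."""
    part = minor % 64
    return f"{name}{part}" if part else name


if __name__ == "__main__":
    print(decode(0x51, STATUS_BITS))   # flags behind status=0x51
    print(decode(0x40, ERROR_BITS))    # flags behind error=0x40
    line = "kernel: end_request: I/O error, dev 03:03 (hda), sector 20273432"
    major, minor, sector = parse_end_request(line)
    print(ide_partition("hda", minor), sector)
```

So status=0x51 is just DriveReady, SeekComplete and Error together, error=0x40 is the uncorrectable-read bit on its own, and dev 03:03 points at the third partition on hda.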

Unfortunately my server is hosted and they can't run an fsck till Monday. It looks to me like a dying HDD, but I was just looking for some other possibilities. Thanks! :-)

Mike.

rch · 05-14-2003, 01:21 AM

The errors are because of SMART attributes.
You have the SMART daemon (smartd) turned on by default in your boot scripts.
You can turn it off through ntsysv, but then you can't monitor the SMART attributes.
So first check whether your hard disk's SMART values have reached the critical region.
If they have, back up your data and use another disk.
Check with:
smartctl -X /dev/hda

mikeyt_333 · 05-14-2003, 10:40 AM

Well, I'm not running smartd, but I installed smartctl and ran it like you suggested and came up with this:

Smart Values Read failed: Input/output error
Smartctl: Smart Values Read Failed

So based on this, it looks like it's screwed, but there's one thing: SMART is not enabled on this drive, so how would it end up getting screwed at all? I ran smartctl -a /dev/hda and got this:

Device: ST330621A Supports ATA Version 4
Drive supports S.M.A.R.T. and is disabled
Use option -e to enable

Which to me would explain why I got the read error above. Please correct me if I'm wrong. I also noticed that the errors from above started popping up when I used smartctl, so it would appear that the problem is in the SMART section of the HDD. Any other suggestions? I would like to consider all options. Also, if it is with the SMART section of the HDD, what causes this to happen? Is it a malfunction?
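
If this check ever needs scripting, a small Python sketch along these lines could read the smartctl -a text and report whether SMART is supported and enabled before anything tries to read values. The function name smart_state is made up, the "disabled" wording is taken from the output above, and the matching "is enabled" wording is only an assumption about what this smartctl version prints when SMART is on.

```python
def smart_state(output):
    """Return (supported, enabled) from smartctl -a text like the output above.

    Assumption: an enabled drive prints the same line with "is enabled"
    in place of "is disabled".
    """
    text = output.lower()
    supported = "supports s.m.a.r.t." in text
    enabled = supported and "is enabled" in text
    return supported, enabled


if __name__ == "__main__":
    sample = (
        "Device: ST330621A Supports ATA Version 4\n"
        "Drive supports S.M.A.R.T. and is disabled\n"
        "Use option -e to enable\n"
    )
    supported, enabled = smart_state(sample)
    if supported and not enabled:
        print("SMART is supported but off; enable it before reading values")
```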

Thanks!
Mike.

rch · 05-15-2003, 04:09 AM

Of course you are right. Maybe you should check the SMART values with smartd enabled. But I don't think it is that serious right now; check how many values have reached the threshold.
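
To make "reached the threshold" concrete, here is a minimal Python sketch of the usual SMART convention: an attribute counts as failed when its normalized value has dropped to or below the vendor threshold, and a threshold of 0 never trips. The Attr type, the failing helper and the sample numbers are all hypothetical and are not read from the drive in this thread.

```python
from typing import NamedTuple


class Attr(NamedTuple):
    attr_id: int
    value: int      # current normalized value (higher is healthier)
    threshold: int  # vendor failure threshold; 0 means "never fails"


def failing(attrs):
    """Return the attributes whose value is at or below a non-zero threshold."""
    return [a for a in attrs if a.threshold > 0 and a.value <= a.threshold]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [
        Attr(attr_id=1, value=60, threshold=34),
        Attr(attr_id=5, value=30, threshold=36),
        Attr(attr_id=9, value=95, threshold=0),
    ]
    bad = failing(sample)
    print(f"{len(bad)} attribute(s) at or below threshold:", [a.attr_id for a in bad])
```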

crashmeister · 05-15-2003, 05:40 AM

Most likely your HDD is starting to develop problems. But there are also some controllers that don't play nice with the kernel and can produce this error message, depending on mobo, kernel version and file system.
I had trouble like this with ext2/3 file systems on my box and the partition got wrecked. ReiserFS and JFS have been working without any trouble for about 6 months now.

mikeyt_333 · 05-19-2003, 10:03 AM

I'll try the "smart" thing, but I am leaning more towards a problem with the filesystem. I was using ext3 once before and all of a sudden my HDD developed problems. I posted on here or somewhere else, and I remember learning then that ext3 was buggy (less than a year ago), so perhaps this is happening again. I'm not sure of the kernel version right now; I'm not in front of the box to check and I can't remember what it was on the last update. I'll check it all out, but most likely I'll just replace the drive and not use ext3, maybe Reiser, we'll see.

Thanks for your replies!
Mike.

rch · 05-19-2003, 11:55 PM

OK, to check that, maybe you can switch to ReiserFS. I use ReiserFS and on all counts it is a better FS than ext3. As for the values, I think that SMART is responsible, only because I had similar problems, and whenever a SMART value changed (/proc/ide/hda/smart_blah) the errors started popping up.