Hi all,
I've been having what I thought were disk issues with a micro ATX box I built last year. A few months ago I started getting some file system errors on one of the 3 hard drives installed.
What would happen was that my Samba daemon would core dump (sorry, I don't have the log file with the error messages anymore). I would reboot, fsck would run, and I'd get the following error:
Code:
Checking root file system...
/dev/hdc1 contains a file system with errors, check forced.
/dev/hdc1:
Duplicate or bad block in use!
So I used a rescue CD and ran fsck to fix the errors. All was good for a few days, and then the same thing would happen again. After 3 or 4 times I started thinking it was a bad disk, so I disabled that disk and did a clean reinstall of the OS (Slackware 12) on one of the remaining 2 disks.
All was good for a few days. I got home tonight, looked at the console, and saw an error message similar to:
Code:
kernel: EXT3-fs error (device md2): htree_dirblock_to_tree: bad entry in directory #3616894: rec_len is too small for name_len - offset=103576, inode=3619715, rec_len=12, name_len=132
(My numbers were different, as I grabbed that snippet from a post I googled.)
I was following said post to diagnose the problem and ran
Code:
find / -inum 3619715 -exec ls -dioF {} \;
which spewed non-filesystem-related errors (sorry, I didn't capture that output either). So I started poking around in /var/log/ to see if I could glean anything. The only thing I got out of it was that I had misconfigured Samba to dump its logs to a nonexistent directory, so I fixed that and went to restart Samba, and I got something on the order of:
Code:
blah blah blah smbd needs GLIBC and can't open /lib/ld.so.6
but the file exists (probably corrupted, though). After googling a bit more, I was thinking maybe it's a bad stick of memory, so I shut down (which hung while unmounting the filesystems) and pulled out one of the memory sticks. Upon reboot I now get
Code:
Checking root file system...
/dev/hdc1 contains a file system with errors, check forced.
/dev/hdc1:
Duplicate or bad block in use!
and now I'm fsck'ing in single-user mode to hopefully fix the errors.
So I'm starting to think maybe I have a bad disk controller or the box is overheating. I did, however, build the box with low-power/low-heat components (WD Green Power drives, an AMD Sempron chip @ 1.9 GHz). I can't imagine two nearly identical disks (both WD Green Power, just different capacities) both going bad. FWIW, the components are housed in an Antec NSK 1380 micro ATX case.
Any thoughts as to what the problem could be?
Thanks in advance for any help!