LinuxQuestions.org
Old 02-25-2009, 09:37 PM   #1
NULL Pointer
LQ Newbie
 
Registered: Jan 2008
Posts: 4

Rep: Reputation: 0
Bad disk, bad disk controller, or bad memory?


Hi all,

I've been having what I thought were disk issues with a micro ATX box I built last year. A few months ago I started getting some file system errors on one of the 3 hard drives installed.

What would happen was that my Samba daemon would core dump (sorry, I don't have the logfile with the error messages anymore). I would reboot, fsck would run, and I'd get the following error:

Code:
Checking root file system...
/dev/hdc1 contains a file system with errors, check forced.
/dev/hdc1:
Duplicate or bad block in use!
So I used a rescue CD and ran fsck to fix the errors. All was good for a few days, then the same thing would happen again. After 3 or 4 rounds of this I started thinking it was a bad disk, so I disabled that disk and did a clean reinstall of the OS (Slackware 12) on one of the remaining 2 disks.

All is good for a few days. I get home tonight, look at the console and I see an error message similar to:

Code:
kernel: EXT3-fs error (device md2): htree_dirblock_to_tree: bad entry in directory #3616894: rec_len is too small for name_len - offset=103576, inode=3619715, rec_len=12, name_len=132
(My numbers were different, as I grabbed that snippet from a post I googled.)
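
To pull the device and inode number out of messages in that format, a rough sketch along these lines may help. `ext3_bad_entries` is a made-up helper name, and the log path in the closing comment is an assumption that varies by distribution:

```shell
#!/bin/sh
# Read kernel log lines on stdin and print "device inode" for each
# EXT3-fs error line that carries an inode= field, as in the message above.
ext3_bad_entries() {
    sed -n 's/.*EXT3-fs error (device \([^)]*\)).*inode=\([0-9][0-9]*\).*/\1 \2/p'
}

# Example (not run here; the log path is an assumption):
# ext3_bad_entries < /var/log/messages | sort | uniq -c
```

The inode number it prints is the value to hand to `find -inum`.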

I was following said post to diagnose the problem and ran

Code:
find / -inum 3619715 -exec ls -dioF {} \;
which spewed non-filesystem-related errors (sorry, I didn't capture that output either). So I started poking around in /var/log/ to see if I could glean anything. The only thing I got out of it was that I had misconfigured Samba to write its logs to a nonexistent directory, so I fixed that, went to restart Samba, and got something along the lines of:

Code:
blah blah blah smbd needs G\LIBC and cant open /lib/ld.so.6
The file exists, though it's probably corrupted. After googling a bit more I thought it might be a bad stick of memory, so I shut down (which hung while unmounting the filesystems) and pulled out one of the memory modules. Upon reboot I now get

Code:
Checking root file system...
/dev/hdc1 contains a file system with errors, check forced.
/dev/hdc1:
Duplicate or bad block in use!
and now I'm running fsck in single-user mode to hopefully fix the errors.

So I'm starting to think I may have a bad disk controller, or the box is overheating. I did, however, build the box with low-power/low-heat components (WD Green Power drives, an AMD Sempron chip @ 1.9 GHz). I can't imagine two nearly identical disks (both WD Green Power, just different capacities) both going bad. FWIW the components are housed in an Antec NSK 1380 micro ATX case.

Any thoughts as to what the problem could be?

Thanks in advance for any help!
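
Note on the `find -inum` step above: it maps an inode number from the kernel message back to a path. Here is a minimal sketch that can be tried safely on a scratch directory. `paths_for_inode` is a hypothetical helper, and `-xdev` is an addition to the original command, on the assumption that you only want matches from one filesystem (inode numbers are only unique within a single filesystem):

```shell
#!/bin/sh
# Print every path under DIR ($1) whose inode number is INUM ($2).
# -xdev keeps find from crossing into other mounted filesystems.
paths_for_inode() {
    find "$1" -xdev -inum "$2" -print 2>/dev/null
}

# Demo on a scratch directory: look a file up by its own inode number.
demo_dir=$(mktemp -d)
touch "$demo_dir/sample.txt"
inum=$(ls -i "$demo_dir/sample.txt" | awk '{print $1}')
paths_for_inode "$demo_dir" "$inum"
rm -rf "$demo_dir"
```

Hard links share an inode, so one inode number can come back with more than one path.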
 
Old 02-26-2009, 04:40 AM   #2
Valery Reznic
ELF Statifier author
 
Registered: Oct 2007
Posts: 676

Rep: Reputation: 137
Possible culprits:
- memory/cache/cpu
- hard drive
- hard drive controller
- board

Running memtest86 can rule out a memory/cache/CPU problem.

You say that you tried 2 different disks and the problem persists, so it's very unlikely to be the disk.

If your board has more than one hard drive controller, try using the other one.
 
Old 03-01-2009, 05:21 PM   #3
NULL Pointer
LQ Newbie
 
Registered: Jan 2008
Posts: 4

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Valery Reznic
Possible culprits:
- memory/cache/cpu
- hard drive
- hard drive controller
- board

Running memtest86 can rule out a memory/cache/CPU problem.

You say that you tried 2 different disks and the problem persists, so it's very unlikely to be the disk.

If your board has more than one hard drive controller, try using the other one.
Thanks for the info.
I ran memtest86 and memory/cache checked out ok.
While browsing for documentation about memtest86 I found this doc on stress testing your cpu:

http://www.ibm.com/developerworks/library/l-hw1/

The test is infinite kernel recompiles. Running it, I found out that I have an overheating or bad CPU. I've got a new one on the way this week. Fingers crossed that the new CPU works in my ATX box; otherwise I'll need to look at different cooling systems.
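
A minimal sketch of that kind of loop, for anyone who wants to try it. `stress_loop` is a hypothetical helper, not something taken from the IBM article, and the kernel tree path and `make -j2` in the closing comment are assumptions. The idea is that a healthy machine should get through the same build over and over, so a failure on a pass that previously succeeded points toward hardware:

```shell
#!/bin/sh
# Repeat a command until it fails or a pass limit is reached.
# Usage: stress_loop MAX_PASSES COMMAND [ARGS...]
# MAX_PASSES=0 means loop forever (the "infinite" case).
stress_loop() {
    max=$1
    shift
    pass=0
    while [ "$max" -eq 0 ] || [ "$pass" -lt "$max" ]; do
        pass=$((pass + 1))
        if ! "$@"; then
            echo "pass $pass FAILED: $*" >&2
            return 1
        fi
        echo "pass $pass ok"
    done
    return 0
}

# Example (not run here): rebuild a configured kernel tree until something breaks.
# cd /usr/src/linux && stress_loop 0 sh -c 'make clean && make -j2'
```

Watching CPU temperature while this runs helps tell overheating apart from a chip that is simply bad.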

Thanks again,

NULL
 
  

