LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Old 12-02-2012, 04:49 PM   #1
Todd1561
LQ Newbie
 
Registered: Oct 2012
Posts: 8

Rep: Reputation: Disabled
Need continual fsck


I have a small Iomega ix2-200 NAS that I've loaded with a custom Debian install. I've found that after a week or so I'll see odd behavior (kernel panics, services failing, etc.). Once I run a fsck, it reports that it fixed some errors and all is well again for another week or so. The configuration is two 1TB SATA drives in a Linux software RAID mirror. Issues like this would normally tell me I'm seeing one or more physical disk failures, but neither disk reports any S.M.A.R.T. errors, even when I force the thorough self-test on each. I've also run the badblocks program on each and it reports no issues.

Any idea what could be causing this? I don't want to just throw new hard drives at it until I know for sure that's the problem.

Thanks,
Todd
 
Old 12-03-2012, 02:13 AM   #2
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,289

Rep: Reputation: 2034
Could be a bad connection between the disks and the motherboard.
(a hot fault?)
You could just try replacing one disk and see if the problem goes away permanently (for that disk).
If it comes back, it's likely not the disk(s).
 
Old 12-03-2012, 04:42 AM   #3
pan64
Senior Member
 
Registered: Mar 2012
Location: Hungary
Distribution: debian i686 (solaris)
Posts: 4,928

Rep: Reputation: 1305
You can try running fsck from crontab, but I suggest finding the cause instead of just repairing the damage each time. Have you checked the log files?
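
As a rough illustration of what "checking the log files" could look like in practice, here is a minimal Python sketch that filters log lines for messages that often accompany disk or filesystem trouble. The patterns are assumptions, not an authoritative list: real kernel messages vary by kernel version and driver, and the sample lines below are made up for the example.

```python
import re

# Assumed patterns only; adjust them to what the kernel on the NAS actually logs.
SUSPECT_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"\bata\d+(\.\d+)?: .*(error|failed|exception)", re.IGNORECASE),
    re.compile(r"EXT[234]-fs error", re.IGNORECASE),
]

def find_suspect_lines(lines):
    """Return the lines that match any of the assumed disk-trouble patterns."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in SUSPECT_PATTERNS)
    ]

# Hypothetical sample lines standing in for a real kernel log.
SAMPLE_LOG = [
    "kernel: ata1.00: failed command: READ DMA",
    "kernel: EXT3-fs error (device md0): ext3_lookup: deleted inode referenced",
    "cron: job started",
    "kernel: end_request: I/O error, dev sda, sector 12345",
]

for hit in find_suspect_lines(SAMPLE_LOG):
    print(hit)
```

To use it on a real system, pass an open log file (an iterable of lines) to `find_suspect_lines` instead of `SAMPLE_LOG`. Which file holds kernel messages depends on the syslog configuration.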
 
Old 12-03-2012, 01:39 PM   #4
Todd1561
LQ Newbie
 
Registered: Oct 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
There is nothing in the logs that indicates a disk failure, just kernel panic messages about services that are failing. Is there not a good way to test for physical disk failures that I haven't already tried?

As for a bad connection, it's a pretty simple device: the disks plug straight into the motherboard of the NAS, with no cables.

I guess I'm just looking for a more definitive answer as to what's wrong with this thing before I start throwing parts at it. Maybe it's not even a hardware problem. Could it be some bug in the Linux RAID system? I've always hated software RAID solutions (regardless of OS), but it's the only option with this device.

Todd
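
On the software RAID question, one cheap thing to rule out is a mirror that is quietly running degraded. The sketch below parses text in the format of /proc/mdstat and lists arrays whose member map shows a missing device (`_`). The sample text and array names are invented for illustration. Note that a healthy-looking `[UU]` only says both members are present, not that the data on them is consistent.

```python
import re

# Start of an array stanza, e.g. "md0 : active raid1 sdb1[1] sda1[0]".
ARRAY_RE = re.compile(r"^(md\d+)\s*:")
# Status part of the following line, e.g. "[2/2] [UU]" or "[2/1] [U_]".
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

def degraded_arrays(mdstat_text):
    """Return names of arrays whose member map contains a missing ('_') device."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            if "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded

# Invented sample in the usual /proc/mdstat layout.
SAMPLE_MDSTAT = """Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      1048512 blocks [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE_MDSTAT))  # ['md1']
```

On the NAS itself the input would come from reading /proc/mdstat rather than from the sample string.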
 
Old 12-04-2012, 03:17 AM   #5
arun5002
Member
 
Registered: Aug 2011
Location: Chennai,India
Distribution: Redhat,Centos,Ubuntu,Dedian
Posts: 549
Blog Entries: 5

Rep: Reputation: Disabled
I guess it could be a hardware issue. Instead of running fsck continually, you could try installing mcelog, which gives a better analysis of hardware issues before the server crashes. I use it on all of my servers, and it gives better prediction of hardware-related problems before a machine goes down.

Have a look at this article:

http://www.cyberciti.biz/tips/linux-...e-failure.html
 
Old 12-04-2012, 07:21 AM   #6
Todd1561
LQ Newbie
 
Registered: Oct 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by arun5002
I guess it could be a hardware issue. Instead of running fsck continually, you could try installing mcelog, which gives a better analysis of hardware issues before the server crashes. I use it on all of my servers, and it gives better prediction of hardware-related problems before a machine goes down.

Have a look at this article:

http://www.cyberciti.biz/tips/linux-...e-failure.html
Thanks for the suggestion, but it appears to only support x64 processors, and this NAS runs an ARM-based processor.

Last night I went ahead and removed one of the drives from the array and rebuilt the array with another 1TB drive. When it was done rebuilding I ran a fsck and looked at the logs in /var/log/fsck/*. In the past the last line was always "fsck died with status 1" (or something like that); this last check didn't have that entry at the bottom. From my research that message means it fixed some errors, so since the line isn't there I assume there were no errors to fix?

I'll let it run like this and see how it goes. If anyone has any more insight into the fsck results please let me know.

Thanks,
Todd
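
On reading fsck results: according to the fsck(8) man page, the exit status is a sum of bit flags, and 1 does mean "filesystem errors corrected", while 0 means no errors. The small Python sketch below decodes a status value into those meanings. The function name is made up for the example, and it says nothing about which tool writes the "died with status" line to the log, so the link between a missing line and a clean check remains an assumption.

```python
# Bit flags as documented in the fsck(8) man page.
FSCK_FLAGS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}

def decode_fsck_status(status):
    """Translate an fsck exit status into a list of human-readable meanings."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_FLAGS.items()) if status & bit]

print(decode_fsck_status(1))  # ['filesystem errors corrected']
print(decode_fsck_status(5))  # ['filesystem errors corrected', 'filesystem errors left uncorrected']
```

A status of 4 or higher is the one to worry about, since it means fsck could not fix everything or did not run properly.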
 
Old 12-04-2012, 08:34 AM   #7
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,454
Blog Entries: 54

Rep: Reputation: 2896
Did you have the same errors before you replaced EMC Lifeline?
 
Old 12-04-2012, 08:47 AM   #8
Todd1561
LQ Newbie
 
Registered: Oct 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by unSpawn
Did you have the same errors before you replaced EMC Lifeline?
No, but in the process of switching it over to the new OS I put in a different drive for one of the 2 drives. What I did last night was put the original drive back in. That was the only hardware change I could think of between the stock setup and the new custom one, so that's why I put the old drive back in.

We'll see what happens.

Todd
 
Old 01-04-2013, 10:06 AM   #9
Todd1561
LQ Newbie
 
Registered: Oct 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Looks like it was a hard drive problem. I swapped out the odd drive I used when I first built the NAS for one of the original drive model, and so far it's been up for 20 days without issue.

Thanks,
Todd
 
Old 01-04-2013, 10:18 AM   #10
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,454
Blog Entries: 54

Rep: Reputation: 2896
Good to hear. Don't forget to mark this thread "solved" when you're ready, TIA.
 
  


All times are GMT -5. The time now is 10:58 AM.
