Linux - Server
I have a small Iomega ix2-200 NAS that I've loaded with a custom Debian install. I've found that after a week or so I'll see odd behavior (kernel panics, services start failing, etc.). Once I run fsck, it reports that it fixed some errors and all is well again for another week or so. The configuration is two 1 TB SATA drives in a Linux software (md) mirror. Issues like this would suggest one or more physical disk failures, but neither disk is reporting any S.M.A.R.T. errors, even when I forcibly run the thorough self-test on each. I've also run the badblocks program on each and it reports no issues.
Any idea what could be causing this? I don't want to just throw new hard drives at it until I know for sure that's the problem.
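For what it's worth, these are roughly the commands I used for the checks above (device names are just examples for the two mirror members on my box; adjust to suit):

Code:
# Kick off the long (thorough) SMART self-test, then view results once it finishes
smartctl -t long /dev/sda
smartctl -a /dev/sda

# Non-destructive, read-only surface scan
badblocks -sv /dev/sda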
Could be a bad connection between the disks and the motherboard (a heat-related fault, perhaps?).
You could just try replacing one disk and see if the problem goes away permanently (for that disk).
If it comes back, it's likely not the disk(s).
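If you do swap a disk, the usual md sequence is something like this (array and partition names below are placeholders; check cat /proc/mdstat for the real ones on your box):

Code:
# Fail and remove the suspect member from the mirror
mdadm /dev/md0 --fail /dev/sdb1
mdadm /dev/md0 --remove /dev/sdb1
# Physically replace the disk, partition it to match, then re-add it
mdadm /dev/md0 --add /dev/sdb1
# Watch the resync progress
cat /proc/mdstat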
There is nothing in the logs that indicates a disk failure, just the kernel panics and messages about services failing. Is there a good way to test for physical disk failures that I haven't already tried?
As for a bad connection, it's a pretty simple device, the disks just plug right into the motherboard of the NAS, no cables.
I guess I'm just looking for a more definitive answer as to what's wrong with this thing before I start throwing parts at it. Maybe it's not even a hardware problem; could it be some bug in the Linux RAID (md) system? I've always hated software RAID solutions (regardless of OS), but it's the only option on this device.
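One thing I can try, if it's useful, is md's own consistency check of the mirror (assuming the array shows up as md0; cat /proc/mdstat gives the real name):

Code:
# Ask md to read back and compare both halves of the mirror
echo check > /sys/block/md0/md/sync_action
# Progress shows up here while it runs
cat /proc/mdstat
# A non-zero count afterwards means the mirror halves disagree
cat /sys/block/md0/md/mismatch_cnt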
I guess it could be a hardware issue. Instead of running fsck over and over, you could try installing mcelog, which gives you better analysis of hardware problems before the server crashes. I've been using it on all of my servers and it gives much better early warning of hardware-related issues before things fall over.
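On Debian the setup is roughly this (package name and daemon usage from memory; check your distro's docs):

Code:
apt-get install mcelog
# Query the running daemon for any logged machine-check events
mcelog --client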
Thanks for the suggestion, but it appears to only support x86 processors, and this NAS runs an ARM-based processor. Last night I went ahead and removed one of the drives from the array and rebuilt the array with another 1 TB drive. When the rebuild finished I ran fsck and looked at the logs in /var/log/fsck/*. In the past the last line was always "fsck died with status 1" (or something like that), but this latest check didn't have that entry at the bottom. From my research that message means it fixed some errors, so with that line absent I assume there were no errors to fix?
I'll let it run like this and see how it goes. If anyone has any more insight into the fsck results please let me know.
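For the record, fsck's exit status is a bit-mask (per fsck(8)), which lines up with that log message: status 1 means errors were found and corrected, and a clean check exits 0. E.g., against the unmounted filesystem (device name is an example for this box):

Code:
fsck -f /dev/md0
echo $?
# 0 - no errors
# 1 - filesystem errors corrected
# 2 - system should be rebooted
# 4 - filesystem errors left uncorrected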
Did you have the same errors before you replaced the stock EMC LifeLine firmware?
No, but in the process of switching it over to the new OS I put in a different drive for one of the two drives. What I did last night was put the original drive back in. That was the only hardware change I could think of between the stock setup and the new custom one, which is why I put the old drive back in.
Looks like it was a hard drive problem after all: I swapped out the odd drive I used when I first built the NAS for the original drive model, and so far it's been up for 20 days without issue.