[SOLVED] Recurring input/output error on my network HD
As the title says, my network hard drive (ext4) that I use for storage keeps having input/output errors. Usually a reboot or running fsck to restore the journal fixes it, but then the problem just returns a couple of days later. The drive is shared via NFS and Samba. Does anybody have a clue as to what might be the cause of these recurring glitches?
EDIT: The HD is new, only two months old.
BR
Last edited by LightSeeker; 09-18-2013 at 04:02 AM.
Are there any other reasons, besides a hard drive failure, why the journal of an ext4-formatted drive would keep getting corrupted every couple of days? I have my home server set to shut down automatically and then reboot every day to save power, and the drive is connected via USB.
You are probably shutting down the server without giving the external drive a chance to write out its cached data. Power to the drive is probably lost for a few seconds. You may need to force a "safely remove" of that drive before the reboot.
By the way why do you want to reboot that host? It is quite unusual.
Thank you for your answer. How can I safely remove it? When I try to unmount it (even with force or lazy unmount) it always just says that the resource is busy. I tried checking the processes tied to it (I think it was lsof), but it was all very confusing (now when I have a problem, I just unplug the disk, plug it back in and do the fsck, then reboot).
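If the lsof output is too confusing, the same question ("which processes still have something open on the drive?") can be answered with a few lines of Python that walk /proc. This is only a rough sketch: the function name `find_holders` and the mount point `/mnt/storage` are made up for illustration, and it has to run as root to see other users' processes. Note also that, as far as I know, an active kernel NFS export keeps a filesystem busy without showing up as a process here or in lsof, so unexporting (`exportfs -ua`) or stopping the NFS server may still be needed before umount succeeds.

```python
import os


def find_holders(mount_point, proc_root="/proc"):
    """Return {pid: [paths]} for processes whose cwd or open files
    live under mount_point. proc_root is a parameter only so the
    function can be pointed at a test directory."""
    mount_point = os.path.normpath(mount_point)
    prefix = mount_point + os.sep
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited or permission denied
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == mount_point or target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return holders

# Example (hypothetical mount point):
#   for pid, paths in sorted(find_holders("/mnt/storage").items()):
#       print(pid, paths)
```

Any PID it reports (typically smbd, or a shell whose working directory is on the drive) has to be stopped or moved elsewhere before the unmount will go through.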
I use rtcwake to shut down, with sleep set to 2 seconds. Perhaps it would be better to give it some more time? Or should I do something else?
I power it down in the middle of the night and turn it back on in the early morning. Since this is a home server, nobody is using it in those hours, so I turn it off to save on electricity.
Shutting the machine down should safely unmount the partitions on the disk, while unplugging when mounted is a good way to get data and/or filesystem corruption. Do you really power down the machine or is this some kind of suspend/standby?
Yes, you are right, but I really didn't know what else to do when I couldn't unmount it remotely. It would probably be doable if I hooked the server up to a monitor and then tried to unmount it in the file manager. But that's quite a hassle, since I would have to unplug a monitor from my computer, carry it to the other end of the house, hook it up, run a desktop environment, etc.
Yes, in general shutdown should work, but in your case an explicit umount (or eject?) followed by a sleep may help. You can insert it into that script.
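To make the "umount, then sleep, then power off" idea concrete, here is a rough sketch of what the nightly script could do. Everything specific in it is an assumption, not something known about this server: the mount point `/mnt/storage`, the service name `smbd`, the use of `systemctl`, the 10 second settle time, and `rtcwake -m off` as the power-down mode. The helper names `build_shutdown_steps` and `run_steps` are invented for the example.

```python
import subprocess


def build_shutdown_steps(mount_point, services, wake_after_s,
                         settle_s=10, unexport_nfs=True):
    """Return the ordered commands for a clean power-down."""
    steps = [["systemctl", "stop", name] for name in services]
    if unexport_nfs:
        steps.append(["exportfs", "-ua"])   # drop all NFS exports
    steps.append(["sync"])                   # flush dirty pages to disk
    steps.append(["umount", mount_point])    # plain umount, no -f or -l
    steps.append(["sleep", str(settle_s)])   # let the drive settle
    steps.append(["rtcwake", "-m", "off", "-s", str(wake_after_s)])
    return steps


def run_steps(steps, dry_run=True):
    """Print the commands, or run them and stop at the first failure."""
    for cmd in steps:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Dry run with assumed values: power off and wake again 6 hours later.
run_steps(build_shutdown_steps("/mnt/storage", ["smbd"], 6 * 3600))
```

With `dry_run=False` the `check=True` makes the script abort as soon as one command fails, so if the umount still reports "busy" the machine is not powered off with the filesystem mounted.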
Have you checked the state of your disk?
I did run some checks on it with fsck -c, but because the disk is 3 TB, the check would take 25 hours when connected through USB, so I always interrupted it prematurely (I needed to access the data). When it exited, it said it had fixed a bad block and that was it.
I'm going to run fsck -vcck today to check for bad blocks and let it run until completion (probably until tomorrow, since it's a 3 TB drive). I was also told that the input/output errors might be connected to the fact that this is an external hard drive connected to the computer via USB: fluctuations in the power the computer supplies over USB might be why the drive keeps losing journal information, which then needs to be recovered. Is there any basis to this claim?
I went to the store and asked about a USB port that would have its own power supply. The salesman said that he could order it, but after hearing that I have a problem with a disk that has a USB 3 output (which I connect to USB 2 on my computer), he said that it probably wouldn't help much with regard to power supply and that a lot of people have had problems with USB 3 based disks.
I'm confused now, I must admit. Will getting a USB port that has its own power mean a steadier supply of power to the hard drive or not? Or should I maybe start thinking about getting a new enclosure for the drive that has its own power supply?
I tried to run the fsck test, but it just kept getting slower and slower, and this morning, after more than 22 hours, it was only at 20%. I interrupted it.
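For what it's worth, the numbers in this thread can be sanity-checked with simple arithmetic. The sketch below (helper names are made up) works out the average transfer rate implied by "3 TB in 25 hours" and the total run time implied by "20% after 22 hours", assuming a decimal 3 TB and a constant rate. Since the scan was reportedly getting slower, the projection is a lower bound at best.

```python
def implied_throughput_mb_s(size_bytes, hours):
    """Average rate in MB/s needed to cover size_bytes in the given hours."""
    return size_bytes / (hours * 3600) / 1e6


def projected_total_hours(elapsed_hours, percent_done):
    """Linear projection of the total run time from progress so far."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    return elapsed_hours * 100 / percent_done


size = 3 * 10**12  # 3 TB, decimal (assumption)
print(round(implied_throughput_mb_s(size, 25), 1), "MB/s")
print(projected_total_hours(22, 20), "hours")
```

The first figure comes out at roughly 33 MB/s, which is about what a USB 2 link delivers in practice, so the 25 hour estimate for the read-only -c scan looks like the USB 2 port being the bottleneck rather than the disk. The second comes out at about 110 hours; a slower run is expected with -cc, because giving -c twice makes e2fsck use a non-destructive read-write test instead of a read-only scan.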
What about your NFS and Samba caches? You can probably turn off all the caches (that will slow down access but will probably avoid corruption).
I do not know whether a new enclosure is cheaper than a USB port with its own power supply.
You can also try another USB port on your PC; you could probably try a USB 2 port (maybe one that is only available on the motherboard).