My server seems to have developed a problem where the data filesystem goes nuts. Files disappear or randomly move from one folder to another. When this happens and I try to recreate the files from backup, I get the error:
cp: cannot create regular file `mytest.cnf': Read-only file system
I cannot umount the partition as I get the error:
umount: /web: device is busy
The only way I have found to bring it back is to reboot, but the problem seems to reoccur.
I am running Mandriva Linux 2006. /web is ext3 on /dev/md0. mdadm reports the RAID array as clean.
Any ideas as to what is causing this, or what I can try to fix it?
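For anyone hitting the same symptom: a "Read-only file system" error on a partition that was mounted read-write often means the kernel hit an ext3 error and remounted it read-only to limit the damage. A minimal sketch for confirming that, assuming the usual /proc/mounts layout (the function name mount_is_readonly is made up for illustration):

```shell
#!/bin/sh
# Report whether a mount point is currently mounted read-only.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts).
# Exit status: 0 = read-only, 1 = read-write, 2 = not mounted.
mount_is_readonly() {
    awk -v mp="$1" '
        $2 == mp { opts = $4; found = 1 }   # keep the last matching entry
        END {
            if (!found) exit 2
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++) if (o[i] == "ro") exit 0
            exit 1
        }' "${2:-/proc/mounts}"
}

# Typical follow-up commands, run as root (not executed here):
#   mount_is_readonly /web && echo "/web is read-only"
#   dmesg | grep -i ext3     # kernel messages about the filesystem
#   fuser -vm /web           # processes that keep /web busy
#   cat /proc/mdstat         # kernel's view of the md arrays
```

If the function reports read-only, the kernel log usually shows the ext3 error that triggered the remount, and fuser shows what is blocking the umount.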
Assuming that it is not a bug with that particular release of Mandriva, it sounds like you might be having hard drive problems.
In a shell, logged in as root, type "smartctl -H" followed by the drive's device name (for example "smartctl -H /dev/sda"). This gives you an overall health status of your hard drive using its built-in monitoring technology (SMART). Start there and see if that's the issue. If not, it could also be a software problem, or even a misconfiguration.
Thanks. Useful looking utility. However all disks report health status as PASSED.
I don't think it's an actual bug with the O/S as the system has been running fine for over a year now. I'm baffled. Not that this is an unusual state for me when it comes to Linux!
That certainly had some effect. I got a few EXT3 errors on shutting down. And on reboot the scan produced loads of errors. I guess I have to check my data for consistency. Is it really that easy? I feel a bit stupid now!
I would say that you definitely have a drive that's about to "go."
If you have a smartctl command anywhere, say in /sbin/... read about it ("man smartctl") then run it. This will give you the drive's own error-logs and diagnostics.
Nevertheless, assume that the drive is about to conk-out and replace it immediately. (USB/Firewire external drives are very handy because you can take their drive out of it, put it into service, and put your existing drive in the external case.)
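A rough sketch of how the smartctl checks might be scripted, in case it helps. The helper name smart_verdict and the /dev/sdX placeholder are assumptions for illustration; substitute each physical disk in the array. Bear in mind that PASSED only means no attribute has crossed the vendor's failure threshold, so the error log and a self-test are worth a look as well.

```shell
#!/bin/sh
# Extract the overall verdict from `smartctl -H` output read on stdin.
# Prints e.g. "PASSED"; prints nothing if no verdict line is present.
smart_verdict() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Typical usage, run as root against each member disk (not executed here):
#   smartctl -H /dev/sdX | smart_verdict
#   smartctl -l error /dev/sdX      # the drive's own error log
#   smartctl -t long /dev/sdX       # start an extended self-test
#   smartctl -l selftest /dev/sdX   # read the self-test results later
```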
Quote: I would say that you definitely have a drive that's about to "go."
He has already run smartctl. His disks passed.
His filesystem was messed up for some reason. Maybe fsck has fixed it, maybe not.
CD, you should take a look in the lost+found directory at the top level of the affected filesystem (/web/lost+found in your case), which is where fsck puts files, or fragments of files, that it does not know what to do with.
If lost+found is empty, you are probably OK; otherwise you should save your data, and probably reinstall from scratch or restore from your backup.
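A quick way to see how much fsck recovered, as a sketch only. The function name count_recovered is made up for illustration, and the -mindepth/-maxdepth options assume GNU find. The directory is normally readable by root only, so run it as root.

```shell
#!/bin/sh
# Count the entries fsck left in a lost+found directory.
# $1 = lost+found directory of the affected filesystem.
count_recovered() {
    find "$1" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}

# Typical usage, run as root (not executed here):
#   count_recovered /web/lost+found
#   file /web/lost+found/*    # guess what each recovered entry contains
```

Recovered entries are named after their inode number (for example "#12345"), so the file command is usually the fastest way to work out what each one used to be.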
What could have caused this? Maybe a power glitch, brownout, power failure, incorrect shutdown, failing drive, or a loose cable or connector...
I've looked in lost+found and there's loads of stuff there. However, I think I've recovered most of the missing bits from backup. Note that the error only affected the *data* drive, not the O/S drive. They are different physical drives. So I think I'm OK (for now at least). I don't want to take a backup until I'm sure that the whole data set is valid, as I only have enough disk space for one backup at a time!
I'm still baffled as to how it could have happened, as I run on a UPS. I'm thinking that maybe one of the drives in the RAID array *is* on the way out, but I have a cold spare, so I think I'll leave it and see what happens.
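If the plan is to wait and see, it may be worth checking the array's member map now and then, so a dropped disk does not go unnoticed. A minimal sketch, assuming the usual /proc/mdstat layout where a healthy two-disk mirror shows [UU] and a degraded one shows [U_] (the function name degraded_arrays is made up for illustration):

```shell
#!/bin/sh
# Print the names of md arrays whose member map shows a missing disk.
# $1 = mdstat file (defaults to /proc/mdstat).
degraded_arrays() {
    awk '
        /^md[0-9]+ *:/ { name = $1 }    # e.g. "md0 : active raid1 ..."
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}

# Typical usage (not executed here):
#   degraded_arrays                 # prints nothing when all arrays are whole
#   mdadm --detail /dev/md0         # per-disk state for one array, as root
```

This only reports whether the array has lost a member. An array that mdadm calls clean can still carry a damaged filesystem, which is what fsck found here.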