Linux - General
This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.
While processing a customer file that lives on my /mnt/win_h (vfat) filesystem, which a Win98SE box reaches through Samba, my app (running on the Win98 side and doing a file-copy operation) suddenly complained that the server disk (/mnt/win_h) was full. Since it's a 13-GB partition with less than 5 GB used, and the copy was only about 100 MB, this gave me pause. To stop my process (which is actually quite a bit more than a simple copy; it was reloading a Btrieve database from a recovery file), I had to unload the Btrieve engine. When I then tried to restart the operation, the directory I was working in came up blank!
At that point I switched the monitor over to see what the Linux box was telling me, and it was complaining about a "directory sread" error on one sector. Possibly it had found a bad sector on the drive, although the drive is less than a year old (a Maxtor 20-GB 7200 RPM unit).
I couldn't find any documentation to help me diagnose the problem, so I rebooted the Linux box from the console. While shutting down, the system complained repeatedly about VFAT problems on that partition. However, when it came back up, no errors were reported, and the directory listings showed all files present. Sampling them indicated that they were all intact, too.
I then copied the file set I had been processing over to another box so that I could continue processing on it. Then I turned my attention to finding out what's wrong with /mnt/win_h. So far the only clue I have is that quite a while after the reboot, when I was trying to create a new zero-length file there via Samba (to see if a problem still existed), I got a console message "Attempt to delete past EOF, Disk panic on 03:09, set read only". Doing "ls -l /dev/hda9" shows that it is indeed major 3 minor 9, and that's mounted as /mnt/win_h...
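For anyone wanting to script that device check, here is a minimal Python sketch. It assumes the kernel prints the device in the panic message as major:minor in hexadecimal, which is how kernels of that era formatted it as far as I know. The function names are made up for illustration.

```python
import os
import stat

def parse_kernel_dev(text):
    """Split a kernel 'MM:mm' device tag into (major, minor).

    Assumption: both fields are hexadecimal.
    """
    major_text, minor_text = text.strip().split(":")
    return int(major_text, 16), int(minor_text, 16)

def node_matches(dev_path, major, minor):
    """True if dev_path is a block device node with the given numbers."""
    st = os.stat(dev_path)
    if not stat.S_ISBLK(st.st_mode):
        return False
    return (os.major(st.st_rdev), os.minor(st.st_rdev)) == (major, minor)

if __name__ == "__main__":
    major, minor = parse_kernel_dev("03:09")
    dev = "/dev/hda9"  # the device node named in the post
    if os.path.exists(dev):
        verdict = "matches" if node_matches(dev, major, minor) else "does not match"
        print(dev, verdict, "03:09")
```

This only confirms which device node the message refers to; it says nothing about the health of the filesystem on it.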
Here's my question at long last: How do I fix it? I'm prepared to copy all the data that's in that directory and its subdirectories over to some other area that doesn't have a problem, but what next? I cannot do "umount" or even "umount -f" on it; the system says it's busy and refuses. I'm familiar with disk I/O, even down at the sub-BIOS level, but I'm quite new to Linux. So what now?
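A "busy" refusal from umount usually means some process still has a file open or its working directory on that filesystem; on a file server, the Samba daemon is a likely suspect, though that is a guess here. Below is a rough Python sketch that walks /proc looking for such processes. It assumes a Linux /proc filesystem and needs to run as root to see other users' processes. The function name `holders` is hypothetical.

```python
import os

def holders(mount_point):
    """Return {pid: [paths]} for processes with an open file or cwd under mount_point."""
    root = os.path.realpath(mount_point)
    prefix = root.rstrip("/") + "/"
    found = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        links = [os.path.join("/proc", entry, "cwd")]
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == root or target.startswith(prefix):
                found.setdefault(int(entry), []).append(target)
    return found

if __name__ == "__main__":
    for pid, paths in sorted(holders("/mnt/win_h").items()):
        print(pid, paths[0])
```

Once the listed processes are stopped, umount should have a better chance of succeeding. The sketch does not catch every way a filesystem can be held busy (memory-mapped files, for example).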
Since it's a FAT partition, why don't you try running the Windows version of ScanDisk on the partition and see what it turns up? If that doesn't help much, you could back up all the data and format the partition again.
I'd love to do just that, but Windows in its infinite "papa knows best" protectiveness won't let me run ScandiskW on any drive across a network connection, and I don't have Windows on the Linux box at all. Is there a similar (or preferably better) disk diagnostic program that I can run as root? (I also tried Norton's Disk Doctor, but it has the same limitation.)
A close check of my kernel logs shows that something went wrong with an inode during a fat_flush call, and from that point on, all attempts to read in that directory got sread errors -- some 5000 lines' worth of error messages in the log! After the reboot, all was well until I attempted to access the file that was being processed when the trouble started. That launched a succession of errors, culminating in the disk panic.
After backing up the 3.5 GB of data on the partition (two ways: copied it to another box that had room, and tarballed it to another partition of the same drive) and doing much research while waiting for the backups to complete, I discovered that "mount -o rw,remount /mnt/win_h" worked to restore write capability. This morning I tried reading the bad file with "cat" and it went all the way through with no errors, so this may have been just a momentary RAM glitch.
Still, I'll feel safer if I can run a disk diagnostic that verifies every sector can be read. This partition is my file server for my 5-station LAN, so it's sorta critical...
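As a rough idea of what such a read check involves, here is a minimal Python sketch that reads a file or block device from start to end in fixed-size chunks and records the offsets where a read fails. The function name and the 1 MiB chunk size are arbitrary choices for illustration. It goes through the kernel's page cache, so recently read data may be served from memory rather than from the platters.

```python
import os

def scan_readable(path, chunk_size=1024 * 1024):
    """Read path start to end; return (bytes_read, offsets where a read failed)."""
    bad_offsets = []
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size for regular files and,
        # on Linux, for block devices as well.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError:
                bad_offsets.append(offset)  # note the failure and skip this chunk
                offset += want
                continue
            if not data:
                break  # unexpected end of data
            bytes_read += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_read, bad_offsets
```

Run as root against the partition's device node, for example `scan_readable("/dev/hda9")`, an empty second element means every chunk came back without an I/O error. A purpose-built tool such as badblocks does the same job more thoroughly, and a clean read pass says nothing about whether the FAT structures themselves are consistent.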