I have a dual-boot system, running 2.4.23 (RH9) on the Linux partition. I've had to switch back and forth between the two quite a bit lately and am forced to run fsck every third or fourth reboot.
When I come in the morning, having left the computer at the linux login screen (runlevel 3) there's often the following message on the screen...
Distribution: Slackware, (Non-Linux: Solaris 7,8,9; OSX; BeOS)
Posts: 1,152
Rep:
/boot isn't changing much, so ext3 isn't really necessary. It shouldn't really matter, but if it concerns you, you can remove the journaling from an ext3 filesystem (revert to ext2) without too much of a problem. Read the man page for tune2fs.
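For reference, a minimal sketch of what reverting /boot from ext3 to ext2 might look like. The function name and the device /dev/hda1 are placeholders, not taken from this thread, and the function only prints the commands so that nothing touches the disk until they have been reviewed.

```shell
# Print (do not run) the steps for dropping the ext3 journal from /boot.
# $1 is the partition that holds /boot; /dev/hda1 below is only a placeholder.
ext3_to_ext2_plan() {
    dev=$1
    printf '%s\n' \
        "umount /boot" \
        "tune2fs -O ^has_journal $dev" \
        "e2fsck -f $dev" \
        "mount -t ext2 $dev /boot"
}

ext3_to_ext2_plan /dev/hda1
```

The /boot entry in /etc/fstab would also need its type changed from ext3 to ext2. Going back later should just be a matter of running tune2fs -j on the same partition and switching the fstab entry back.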
Is the error message always the same, or is it similar to the one above?
It's always the same message. I couldn't swear to the inode, and some of the other numbers, but other than that it's the same.
Yeah, I thought of switching back to ext2, but wouldn't that be just switching off the error message? Or could the journalling itself be causing the problem? I guess this is the same thing that's causing me to have to do a manual fsck so often.
Quote:
Originally posted by jkobrien It's always the same message. I couldn't swear to the inode, and some of the other numbers, but other than that it's the same.
Yeah, I thought of switching back to ext2, but wouldn't that be just switching off the error message? Or could the journalling itself be causing the problem? I guess this is the same thing that's causing me to have to do a manual fsck so often.
Thanks for the reply,
John
Actually, the numbers are the important part. If they're the same, you may just have a bad cluster on your disk. If they're changing, then something more random is happening. . . IF the journal is the problem, then turning it off will do away with the error (on a level other than just hiding it).
The error IS an ext3 error. I recommend removing the journal (you can always go back to ext3 if it's not the problem) and looking for errors. . .
Shade, yes, the reboots are always clean - /sbin/reboot or /sbin/init 0.
Moses, a bad cluster could be it alright. Between filesystem checks, the error messages are identical. I've just run fsck and will wait a day or two to see if they're still the same. After that I'll convert /boot to ext2 and let you know what happens.
If it's something physically wrong with the drive, changing partitions won't fix it. . . Back up your /boot to somewhere safe (CD-ROM) before you make any changes. . . Back up your other important stuff from this drive too.
I left my system for a few days booted to linux and didn't see the error recur. Then I rebooted to MS-Win for another few days and when I rebooted back to linux there was the same error message again. The only difference being the "rec_len = 8247" above was now "rec_len = 8259". All other numbers were identical. Could this be a symptom of a bad cluster? When I ran fsck, there was only one error reported this time, as opposed to quite a list the last time.
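Since the question turns on which numbers change between occurrences, here is a rough sketch of a way to compare two saved copies of the message field by field. The function names are made up for illustration, it assumes the numbers appear as "name = value" pairs like the rec_len values quoted above, and it assumes a grep that supports -o.

```shell
# Print every "name = number" pair in a message as "name=number", one per line.
fields() {
    printf '%s\n' "$1" | grep -oE '[A-Za-z_]+ ?= ?[0-9]+' | tr -d ' '
}

# Print the fields whose values differ between an old and a new message.
changed_fields() {
    old=$(fields "$1")
    new=$(fields "$2")
    for pair in $new; do
        name=${pair%%=*}
        prev=$(printf '%s\n' "$old" | grep -E "^${name}=" | head -n 1)
        [ "$prev" = "$pair" ] || printf '%s -> %s\n' "${prev:-(missing)}" "$pair"
    done
}

changed_fields 'rec_len = 8247' 'rec_len = 8259'
```

Passing the two full messages instead of just the rec_len fragment would list only the fields that moved, which makes it easier to tell a fixed bad spot from something more random.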
It seems to me like something that happens while booted to MS-Win is causing the problem. All I can think of is a nightly virus scan that our sysadmin has scheduled. I wouldn't have thought that MS-Win programs would even be aware of the linux partitions though.
I've now converted /boot to ext2 (and backed up! Good advice!) and will wait again for a few days to see what happens.
I don't know what Windows could do to the partition. My first inclination is to think it's just a coincidence that this happened after booting to Windows. However, it'll stay on the back burner. . .
How often were you rebooting to Linux? How long would the system stay in Linux? How long were you using Linux before you'd notice the error messages? Have you looked in the syslog, /var/log/messages, and dmesg for the error? Does it occur during boot, or at some other time?
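A small sketch of one way to do that log check. The function name is made up, and the pattern assumes the kernel prefixes these messages with "EXT3-fs" or "EXT2-fs", which is the usual form but is not quoted in this thread.

```shell
# Keep only ext2/ext3 error and warning lines from whatever arrives on stdin.
fs_errors() {
    grep -iE 'ext[23]-fs (error|warning)'
}

# Typical use (the log path is the usual Red Hat default; adjust as needed):
#   dmesg | fs_errors
#   fs_errors < /var/log/messages
```

The syslog timestamps on the matching lines should show whether the error is logged during boot or some time afterwards.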
I'm a bit anal about loose ends. So this is just to report that since converting the /boot partition back to ext2, that error hasn't recurred and there's no sign yet of any other repercussions.
I think it was some sort of journalling error. The message came up after every third or fourth reboot or sometimes would appear at the login prompt and then wouldn't recur for a few weeks (which is why I've waited so long to pronounce it gone).
I'm not convinced it's just the filesystem error--usually there's a "good" reason for the error, and with filesystems, it's usually either a bug (those get reported relatively quickly on filesystems since it's very very important to have a reliable filesystem) or a disk problem. I'm guessing that your issue is actually a disk problem that isn't being "activated" as often by ext2 as it was by ext3. I would still be careful about backups. . .
If I could have tracked down the error, I would have stuck with ext3 but as I couldn't trace the directory or file node, I couldn't get anywhere with it.
Back from the "spoke-too-soon" department. That error has recurred. In slightly different format this time - presumably because the filesystem is now ext2 rather than 3.
I've been mostly using the MS-Win partition lately but had switched back to Linux occasionally with no sign of any problems. I switched to Linux again this morning and left it at the login screen (runlevel 3) over lunch. When I came back just now the above message was on screen.
I've just checked dmesg and the same error is there.
I don't know, it really looks like a disk issue to me. . . You might be able to mitigate it with a bad blocks check using the ext2 tools, but I don't know. Physical hardware problems (especially hard disks) are difficult to get around, and once they start going from bad to worse, I've usually just given up and purchased new hardware--it's usually not worth my time to fight with bad hardware. Anyway, my suggestion is that you keep making backups of your important data (don't overwrite your old backups, make new ones) and wait until you either find the cause of the error or decide just to give up and get a new disk. If you're not having real data loss issues, you can probably make it for quite some time before you need to replace the drive.
It's also possible, though improbable, that it's not a drive issue and is, instead, a bus issue. I say improbable because the error looks like the disk is returning a bad result when a read is performed on specific bad blocks. Try the filesystem tools; I know there are ways to check for (and then mark as unusable) bad blocks on the drive. . . See the man pages for tune2fs, e2fsck, badblocks, etc. . .
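As a rough sketch of the bad blocks check: e2fsck with -c runs a read-only badblocks scan and adds anything it finds to the filesystem's bad block list. The function name and /dev/hda1 are placeholders, and the function only prints the commands rather than running them.

```shell
# Print (do not run) the steps for a read-only bad block scan of /boot.
# $1 is the partition that holds /boot; /dev/hda1 below is only a placeholder.
badblock_scan_plan() {
    dev=$1
    printf '%s\n' \
        "umount /boot" \
        "e2fsck -f -c $dev" \
        "mount /boot"
}

badblock_scan_plan /dev/hda1
```

The final mount relies on the /boot entry in /etc/fstab. If the scan keeps turning up new bad blocks on later runs, that would point towards the drive itself failing.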