LinuxQuestions.org


GazL 10-24-2012 02:21 PM

WARNING!: Ext4 data corruption kernel bug on the prowl.
 
Stable kernels not so stable (again): Film at 11.

http://lwn.net/Articles/521022/
Quote:

In short: ext4 users would be well advised to avoid versions 3.4.14, 3.4.15, 3.5.7, 3.6.2, and 3.6.3; they all contain a patch which can, in some situations, cause filesystem corruption.
From the posts I've read it doesn't sound like the kernel devs are 100% certain what is going on just yet. No mention of whether Ben Hutchings' 3.2.y branch is affected but probably best to be careful out there until more comes to light.

Best keep a close eye on the news sites such as lwn for a while.
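
For anyone wanting to check a box quickly, here is a minimal Python sketch that compares the running kernel's version against the five releases named in the LWN quote. The helper names (`base_version`, `is_listed`) are made up for illustration, and the set contains only what the quote states; it makes no claim about any other version.

```python
import platform
import re

# Versions named in the LWN quote; nothing else is assumed to be affected.
AFFECTED = {"3.4.14", "3.4.15", "3.5.7", "3.6.2", "3.6.3"}


def base_version(release):
    """Return the leading numeric part of a kernel release string,
    e.g. '3.6.2-smp' -> '3.6.2'. Returns None if there is none."""
    match = re.match(r"\d+(?:\.\d+)*", release)
    return match.group(0) if match else None


def is_listed(release):
    """True if the release's numeric version is one of the quoted versions."""
    return base_version(release) in AFFECTED


if __name__ == "__main__":
    rel = platform.release()
    print(rel, "is listed" if is_listed(rel) else "is not listed")
```

The match is exact on purpose: a suffix such as `-smp` is ignored, but 3.6.20 is not confused with 3.6.2.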

jtsn 10-24-2012 04:19 PM

Ext2 filesystem corruption was my main reason for switching mission-critical servers to BSD back in 2000 (and back to Slackware as ext3 settled in). So this is not the first time, but the continuation of a long sad story.

T3slider 10-24-2012 04:23 PM

And for once in my life I've kept up with the 3.4 branch diligently. FML. ;)

astrogeek 10-24-2012 04:28 PM

I've installed my Slackware 14 with ext3 for the ability to mount NFS on some older boxen...

Once more, lagging behind has an unexpected advantage! But it is getting crowded back here!

D1ver 10-24-2012 04:58 PM

Sounds like it was a good time to experiment with XFS..

GazL 10-24-2012 05:13 PM

Quote:

Originally Posted by D1ver (Post 4814159)
Sounds like it was a good time to experiment with XFS..

:) How's that working out for you?

Beelzebud 10-24-2012 05:16 PM

More confirmation that I made the right choice by making Slackware my distro. As if I needed any more. :)

D1ver 10-24-2012 05:26 PM

Quote:

Originally Posted by GazL (Post 4814164)
:) How's that working out for you?

[OT]
Seems great so far. I watched this presentation on recent improvements in XFS and decided to try it out. I'm using it on a 128 gig SSD laptop, which is probably not the best use case for XFS.

I've got a 2 TB RAID 1 home media server running Slack 14 with ext4 at the moment. I'm considering reformatting it to XFS, as lots of big files are supposed to be XFS's strong suit.

[/OT]

Gerard Lally 10-24-2012 07:51 PM

Quote:

Originally Posted by jtsn (Post 4814144)
Ext2 filesystem corruption was my main reason for switching mission-critical servers to BSD back in 2000 (and back to Slackware as ext3 settled in). So this is not the first time, but the continuation of a long sad story.

I was reading Greg Kroah-Hartman's dismissive comments about NetBSD the other day. This puts his rather smug and condescending attitude in perspective, doesn't it?

But it's just a major Linux filesystem. Nothing important.

damgar 10-24-2012 08:20 PM

Quote:

Originally Posted by D1ver (Post 4814159)
Sounds like it was a good time to experiment with XFS..

Yeah, and here I was, having decided to finally play with ext4 again on my new raid0 setup, on a system that has a 1 TB partition and 10GB files, just when I had got comfortable with XFS after using ext3 since 13.1. At least I got the bugs worked out to where I can just format and reinstall after a quick copy over....

Edit: Oh wait, I had only built a 3.6.2 kernel to experiment with and I have ALREADY formatted and reinstalled using the original 3.2.x kernel that came stock with 14.0! Now I just have to create an XFS partition going forward and I'm good! Good call Pat! :D

TobiSGD 10-24-2012 08:28 PM

All software has bugs and some of them are critical. File-system drivers are software, so sooner or later any of them can be the victim of a critical bug. Has this error actually occurred to anyone? Were your backups also affected?

jtsn 10-24-2012 09:13 PM

Quote:

Originally Posted by gezley (Post 4814248)
I was reading Greg Kroah-Hartman's dismissive comments about NetBSD the other day. This puts his rather smug and condescending attitude in perspective, doesn't it?

The BSDs have a different development model with a -stable and a -current branch. For a reason.

Quote:

Originally Posted by TobiSGD (Post 4814270)
Every software has bugs and some of them are critical. File-system drivers are software, so sooner or later any of them can be the victim of a critical bug.

But if you have sane release engineering (which our beloved kernel has not), such critical bugs don't hit your end-users.

ReaperX7 10-24-2012 11:02 PM

Another Red Hat goon named Greg Kroah-Hartman posted something else negative towards BSD eh? Not surprising. He must be a friend of Lennart.

sombragris 10-24-2012 11:22 PM

From the 14.0 ChangeLog:

Quote:

Code:

Fri Aug 24 20:08:37 UTC 2012
This is Slackware 14.0 release candidate 3, and is hopefully the last stop
on our long road to a stable Slackware release soon. After hearing that
the 3.4.x kernel series will have long term support, I tested 3.4.9 hoping
that it would prove stable enough to use that as the release kernel, but
there are problems with an oops in kernel/time/clocksource.c every few boots.
Given that the 3.2.x series has been very stable, it seems prudent to stick
with that for release, and 3.2.28 is going to be the release kernel. So,
one more round of testing. Let me know if there are any problems. Thanks!


Thank you Pat!! You rock!

damgar 10-25-2012 12:01 AM

Quote:

Originally Posted by ReaperX7 (Post 4814327)
Another Red Hat goon named Greg Kroah-Hartman posted something else negative towards BSD eh? Not surprising. He must be a friend of Lennart.

From Wikipedia:
Quote:

Greg Kroah-Hartman is a Linux kernel developer. He is the current Linux kernel maintainer for the -stable branch with Chris Wright, the staging subsystem, USB, driver core, debugfs, kref, kobject, and the sysfs kernel subsystems, Userspace I/O (with Hans J. Koch) and TTY layer. He is also the maintainer of the linux-hotplug and created the udev projects. Additionally, he helps to maintain the Gentoo Linux packages for these programs, and helps with the kernel package. He worked for Novell in the SUSE Labs division and, as of 1 February 2012, works at the Linux Foundation. He is currently working full time on the Linux Driver Project.
He is a co-author of Linux Device Drivers (3rd Edition) and author of Linux Kernel in a Nutshell, and used to be a contributing editor for Linux Journal. He also contributes articles to LWN.net, the news computing site.
Kroah-Hartman frequently helps in the documentation of the kernel and driver development through talks and tutorials. In 2006, he released a CD image of material to introduce a programmer to working on Linux device driver development.

ponce 10-25-2012 12:08 AM

he likes good muzak too http://bit.ly/RX8Sk8 :)

Petri Kaukasoina 10-25-2012 12:10 AM

See https://lkml.org/lkml/2012/10/24/620
It's from the original reporter of the problem. He says for example: "It occurs to me that it is possible that this bug hits only those filesystems for which a umount has started but been unable to complete. If so, this is a relatively rare and unimportant bug which probably hits only me and users of slow removable filesystems in the whole world..." and "Verified! You do indeed need to do passing strange things to trigger this bug -- not surprising, really, or everyone and his dog would have reported it by now. As it is, I'm sorry this hit slashdot, because it reflects unnecessarily badly on a filesystem that is experiencing problems only when people do rather insane things to it."

So, it seems the bug bites if the system is rebooted or powered off while umount is running but has not yet finished.

I wouldn't change the filesystem quite yet...

damgar 10-25-2012 12:19 AM

I've always had especially bad luck with corrupting data and filesystems with ext4. That's not to say it's necessarily the file system's fault, just that it's happened numerous times to me. It's just never been very fault tolerant in my experience. I just now came back to ext4 because it seems like xfs will severely crunch my system for a moment during heavy read/write, so it is disconcerting for bugs to be showing up. Anyway, I'm not using supposedly affected kernels and I'm still not keeping data on ext4, so we shall see what happens.

damgar 10-25-2012 12:21 AM

Quote:

Originally Posted by ponce (Post 4814341)
he likes good muzak too http://bit.ly/RX8Sk8 :)

HA! Ya gotta love a little Jello Lennon :D

H_TeXMeX_H 10-25-2012 02:27 AM

Another reason I don't use the popular ext* filesystems. They don't have a good history and it continues up to today.

ppencho 10-25-2012 04:47 AM

I don't know if this was related to the 'ext4' issue, but I had a data loss (single file) 2-3 weeks ago. I was working on a project in Qt Creator when there was a power outage. The UPS (APC, but very old) saved me for a few seconds, but it switched off while the PC was still powered. I don't remember the exact PC state at that moment: whether I had closed Qt Creator, left KDE, or started the shutdown sequence; it happened so fast.
Next time I loaded the project one of the cpp files was empty (0 length).

I use Slackware64-current.

TobiSGD 10-25-2012 05:18 AM

Quote:

Originally Posted by H_TeXMeX_H (Post 4814410)
Another reason I don't use the popular ext* filesystems. They don't have a good history and it continues up to today.

I would maybe see it the other way around. Because they are popular, more bugs are reported, which gives them a "bad history" but makes them more stable in the end. This one, for example, occurs only under extreme conditions, but was found nonetheless.

Pixxt 10-25-2012 06:07 AM

Quote:

Originally Posted by jtsn (Post 4814144)
Ext2 filesystem corruption was my main reason for switching mission-critical servers to BSD back in 2000 (and back to Slackware as ext3 settled in). So this is not the first time, but the continuation of a long sad story.

I quit using FreeBSD years ago, around the 4.x-5.x series, because it would either not boot or boot and then crash because I was using a USB keyboard, among other oddities. Besides that, I never found FreeBSD to be more stable or faster than Linux. I do prefer some of its userland. The newish FreeBSD and NetBSD init systems rock, and they don't have the LP mess of PulseAudio and systemd, but the kernels are trash.

BSD users/devs are still the most elitist pricks this side of Apple. I would love to have a BSD like system with a Linux kernel, the closest I have come is Slackware.


Quote:

Originally Posted by H_TeXMeX_H (Post 4814410)
Another reason I don't use the popular ext* filesystems. They don't have a good history and it continues up to today.

The ext filesystems are the least problematic general-purpose filesystems in Linux; they are better tested and have a more bug-free track record than XFS, ReiserFS, JFS, or Btrfs.

Gerard Lally 10-25-2012 06:58 AM

Quote:

Originally Posted by Pixxt (Post 4814531)
BSD users/devs are still the most elitist pricks this side of Apple.

Strong words. I don't consider myself elitist and I have found the NetBSD devs and users anything but elitist.

H_TeXMeX_H 10-25-2012 07:21 AM

Quote:

Originally Posted by Pixxt (Post 4814531)
The ext filesystems are the least problematic general-purpose filesystems in Linux; they are better tested and have a more bug-free track record than XFS, ReiserFS, JFS, or Btrfs.

As people often tell me nowadays: prove it.

Ok, maybe I won't be that mean, how about some examples or articles ?

Honestly I think the bugs are due to bad decisions made by the devs, like when they first released ext4 and made the default 'data=writeback'. It was changed eventually, but it took time and data loss.

qweasd 10-25-2012 12:26 PM

Well, I am very happy with Pat's 3.2.29 everywhere but my laptop, where I have to run at least 3.3 on account of Ivy Bridge. I can't pretend to understand this thread, but it looks like everything from 3.3 and up may be affected, so I guess I just have to not reboot until they fix it :)

GazL 10-25-2012 12:33 PM

Now that more details have come out I don't think there is much cause for concern, unless your system has a tendency not to shut down cleanly for some reason.

Pixxt 10-25-2012 03:11 PM

Quote:

Originally Posted by GazL (Post 4814785)
Now that more details have come out I don't think there is much cause for concern, unless your system has a tendency not to shut down cleanly for some reason.

https://plus.google.com/117091380454...ts/Wcc5tMiCgq7

You're right!

Martinus2u 10-25-2012 03:17 PM

Quote:

Originally Posted by TobiSGD (Post 4814270)
Has this error actually occurred to anyone?

I have lost a root file system on one machine last week. I can't say for certain it was the bug in question. It certainly followed precisely the pattern that is said to trigger the bug, but on the other hand the machine wasn't on 3.5.7 yet. I haven't had the time yet to assess the situation and salvage whatever I can, followed by a re-install.

irgunII 10-26-2012 10:58 AM

ReiserFS kicks ass. It's not let me down in 12 years of use. Started using it with SuSE 7.3 through openSUSE 11.3, then in Slackware 13.37 and now 14.0

Petri Kaukasoina 10-28-2012 02:18 AM

It seems that the bug has been found. https://lkml.org/lkml/2012/10/28/1

Kernel versions 3.4, 3.5, 3.6 have the bug. Kernels 3.2 and 3.3 are ok. Slackware-14.0 uses kernel 3.2.29 which is fine.

(Note that the bug was not added in any stable kernel iterations 3.4.x or 3.6.x. It was already in the original 3.4.)

GazL 10-28-2012 05:35 AM

From another of Eric's posts in the thread linked to by Petri:
Quote:

So anyway, turning on journal_async_commit (notionally unsafe) enables journal_checksum (apparently broken).
So, turning on a potentially unsafe option, turns out to be actually unsafe, because it is broken!
I'll mark this as SOLVED. Anyone using sensible mount-options won't be hitting this.
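
To see whether any mounted ext4 filesystem is actually using the two options named in the quote, something like the following Python sketch could help. It reads text in the `/proc/mounts` layout (device, mount point, type, options). The function name `risky_ext4_mounts` is made up for illustration, and it is an assumption that a given kernel reports these options in `/proc/mounts`, so it is worth checking `/etc/fstab` by eye as well.

```python
# Option names taken from the quoted post; both are ext4 mount options.
RISKY = {"journal_async_commit", "journal_checksum"}


def risky_ext4_mounts(mounts_text):
    """Return a list of (mount_point, [risky options]) for ext4 entries in
    text laid out like /proc/mounts: device mountpoint fstype options ..."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not ext4.
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        hits = sorted(RISKY.intersection(fields[3].split(",")))
        if hits:
            found.append((fields[1], hits))
    return found


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mount_point, opts in risky_ext4_mounts(f.read()):
            print(mount_point, ",".join(opts))
```

No output means none of the listed ext4 mounts reported either option.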

EdGr 10-28-2012 09:08 AM

Quote:

Originally Posted by irgunII (Post 4815565)
ReiserFS kicks ass. It's not let me down in 12 years of use. Started using it with SuSE 7.3 through openSUSE 11.3, then in Slackware 13.37 and now 14.0

Be careful.

ReiserFS used to be good, but in recent kernels it has both momentary data corruption and performance problems. http://www.linuxquestions.org/questi...xt4-4175425294

These problems went away when I switched to ext4. Since not many people use or care about ReiserFS, its quality has gone down hill.
Ed

jstg 10-28-2012 09:18 AM

Quote:

Originally Posted by EdGr (Post 4816733)
Be careful.

ReiserFS used to be good, but in recent kernels it has both momentary data corruption and performance problems. http://www.linuxquestions.org/questi...xt4-4175425294

These problems went away when I switched to ext4. Since not many people use or care about ReiserFS, its quality has gone down hill.
Ed

I liked ReiserFS. It was fast. Or so it seemed. Somewhere along the way though I switched to ext4 for reasons I can't even remember.

irgunII 02-19-2013 10:39 AM

ReiserFS is working fine on this system I have now (ASUS M5A97LE R2.0, AMD Athlon II X2 260, 4GB RAM) and on my old system (ASRock A770DE+, AMD Athlon II X2 250, 2GB RAM which I gave to my 73 year old mom as *her* old system was a P-3!). I've yet, in all these years of using ReiserFS exclusively, to have a problem with it. I'll keep using it until the day it *does* start to screw up, but in thirteen years this month (so far) it hasn't and it is nice and stable and takes the beatings of power outages and brown-outs extremely well, imho.

salemboot 02-19-2013 10:35 PM

Quote:

Originally Posted by irgunII (Post 4895193)
ReiserFS is working fine on this system I have now (ASUS M5A97LE R2.0, AMD Athlon II X2 260, 4GB RAM) and on my old system (ASRock A770DE+, AMD Athlon II X2 250, 2GB RAM which I gave to my 73 year old mom as *her* old system was a P-3!). I've yet, in all these years of using ReiserFS exclusively, to have a problem with it. I'll keep using it until the day it *does* start to screw up, but in thirteen years this month (so far) it hasn't and it is nice and stable and takes the beatings of power outages and brown-outs extremely well, imho.

I never had any issues with it either.

I've used it on External Drives quite a bit. Never lost data.

Even on crap hardware with dying controllers, it recovered.

I'd use it before I'd ever go back to NTFS.


All times are GMT -5.