Hi, I have a Fedora 35 system that uses rsync to operate as a backup server. It has an 8TB RAID5 array, and for the last few days it has crashed/segfaulted at what appears to be the same time that it starts to back up a particular remote host. This suggests to me that some spot on the disk related to this particular host's data is triggering the crash.
When it happens, there is a segfault message on the console, but nothing related to it in the logs. There are bits from the kernel about being unable to write prior to the crash, however:
Code:
Aug 1 12:24:32 mail03 kernel: [2415225.412978] EXT4-fs warning (device md2): ext4_end_bio:343: I/O error 10 writing to inode 232141206 starting block 3033088)
Aug 1 12:24:32 mail03 kernel: [2415225.412987] Buffer I/O error on device md2, logical block 3033088
Aug 1 12:24:32 mail03 kernel: [2415225.413025] Buffer I/O error on device md2, logical block 3033089
...
Aug 1 12:24:32 mail03 kernel: [2415225.526007] JBD2: Detected IO errors while flushing file data on md2-8
Aug 1 12:24:35 mail03 kernel: [2415227.560338] JBD2: Detected IO errors while flushing file data on md2-8
How do I identify which of the four disks this is? I've run smartctl short checks on each disk in the array, but all four passed without error. What is md2-8?
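As a starting point for narrowing this down, here is a minimal Python sketch that pulls the device, logical blocks, and inode numbers out of kernel lines shaped like the ones quoted above. The function name `summarize_io_errors` and both regular expressions are illustrative assumptions based only on the lines in this post; other kernel versions may word these messages differently. It does not identify the failing physical disk by itself. It only collects what the log already says.

```python
import re
from collections import defaultdict

# Patterns are assumptions modelled on the log lines quoted in this thread.
BUFFER_ERR = re.compile(
    r"Buffer I/O error on device ([^,\s]+), logical block (\d+)"
)
EXT4_WARN = re.compile(
    r"EXT4-fs warning \(device ([^)\s]+)\): .*?writing to inode (\d+)"
)


def summarize_io_errors(lines):
    """Group logical blocks and inode numbers by the device named in each line."""
    blocks = defaultdict(set)
    inodes = defaultdict(set)
    for line in lines:
        match = BUFFER_ERR.search(line)
        if match:
            blocks[match.group(1)].add(int(match.group(2)))
        match = EXT4_WARN.search(line)
        if match:
            inodes[match.group(1)].add(int(match.group(2)))
    devices = sorted(set(blocks) | set(inodes))
    return {
        dev: {"blocks": sorted(blocks[dev]), "inodes": sorted(inodes[dev])}
        for dev in devices
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pipe kernel log lines in on stdin.
    for dev, info in summarize_io_errors(sys.stdin).items():
        print(dev, "blocks:", info["blocks"], "inodes:", info["inodes"])
```

The inode number it reports can then be mapped to a path with something like `find <mountpoint> -inum <inode>`, which would confirm whether the failing writes really belong to that one host's backup data.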
RAID is designed to handle disk failures; that is its primary and basic function. In all my experience with md, disk errors a) were logged properly, including physical device names, and b) never caused the system to fail. If that does happen, it indicates either a major bug or, much more likely, that something else is going on, especially since, as I understand your message, the RAID is a plain data volume with the root and swap filesystems somewhere else. What exactly is the segfault message?
Yes, your assumptions are correct. Root and swap and home are on different partitions.
Too little of the segfault message is visible on the screen to get a real idea of what caused the failure, and there is never any info in the logs.
I've also run it through 4 hours of memtest, so I don't think it's a hardware/CPU/mem problem.
I'm going to start the backups again and see if it still fails in the same spot.
Okay, I restarted the backup that I know appears to write to the area that causes the panic, and within 15 minutes of running, it produced this on the screen. I can't scroll up or use shift-pgup, as I normally would, to see the top of the message. I'm not sure how helpful this is.