Old 07-05-2022, 10:10 AM   #1
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,555

Rep: Reputation: 177
Clone multiply-claimed blocks


I ran 'fsck.ext4 -f -y /dev/md0' on my RAID-6 array. Why am I getting "Clone multiply-claimed blocks" errors? This fsck has been running for 14 hours, and the last "Clone" message shown, for 2022-01-01-pensionFilesFullBackup.tar.bz2, was displayed 6 hours ago.

Code:
Pass 1D: Reconciling multiply-claimed blocks
(There are 8 inodes containing multiply-claimed blocks.)

File /Backups/MAIL/2021-11-30-MAILfullbackupUSR.tar.bz2 (inode #10657818, mod time Wed Dec
  has 4449 multiply-claimed block(s), shared with 1 file(s):
        ... (inode #437919747, mod time Sat Jul  2 20:10:42 2022)
Clone multiply-claimed blocks? yes

File /Backups/MAIL/2021-11-30-MAILfullbackupSYS.tar.bz2 (inode #10657869, mod time Wed Dec
  has 4252 multiply-claimed block(s), shared with 1 file(s):
        ... (inode #437919747, mod time Sat Jul  2 20:10:42 2022)
Clone multiply-claimed blocks? yes


File /Backups/SQLServerBackup/Quarterly/master/master_backup_20220701201002.bak (inode #1070:03 2022)
  has 996 multiply-claimed block(s), shared with 1 file(s):
        /Backups/public/2022-01-01-publicFullBackup.tar.bz2 (inode #407691271, mod time Sun
Clone multiply-claimed blocks? yes

File /Backups/PensionFiles/2022-01-01-pensionFilesFullBackup.tar.bz2 (inode #12787733, mod
  has 9980 multiply-claimed block(s), shared with 1 file(s):
        ... (inode #437919747, mod time Sat Jul  2 20:10:42 2022)
Clone multiply-claimed blocks? yes
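
In case it matters, I figure I can map that repeated inode #437919747 back to a pathname with debugfs once fsck finishes (just a sketch; debugfs opens the device read-only by default, and the filesystem should not be mounted read-write while poking at it):
Code:
# print the path(s) that point at the inode fsck keeps reporting
debugfs -R 'ncheck 437919747' /dev/md0
# and show that inode's details (size, mod time, block count)
debugfs -R 'stat <437919747>' /dev/md0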

Last edited by mfoley; 07-05-2022 at 02:46 PM. Reason: too much blah, blah detail.
 
Old 07-06-2022, 03:01 PM   #2
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,555

Original Poster
Rep: Reputation: 177
No one has anything on this, eh? The fsck on the RAID has been running for 37 hours so far. Is this normal?

I'm beginning to wonder whether a 4-drive RAID-6 is doing what I intended. A RAID may guard against hardware failure in one of its members, but corruption in the filesystem apparently negates the RAID benefit entirely. I'm thinking about converting these 4 drives into two RAID-1s: one as the main online device and the other as the target of a periodic rsync that clones the production array (something like the sketch below). That way, if the production array's filesystem gets corrupted, the mirror could be used. That would be a heck of a lot quicker than fsck'ing for 4 or more days, with possible data loss as well. I could have rebuilt this RAID from scratch in less time!
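
Something like this is what I have in mind for the periodic clone (a sketch only; /mnt/raid_main and /mnt/raid_mirror are placeholder mount points, not real ones):
Code:
# placeholder mount points: raid_main = production RAID-1, raid_mirror = the clone target
# -a preserves permissions/times/owners, -H hard links, -A/-X ACLs and xattrs,
# --delete makes the mirror an exact copy, including removals
rsync -aHAX --delete /mnt/raid_main/ /mnt/raid_mirror/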

Last edited by mfoley; 07-06-2022 at 03:19 PM.
 
Old 07-06-2022, 10:05 PM   #3
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
RAID has absolutely nothing to do with filesystem corruption or any other cause of data loss or corruption originating from higher up in the software/firmware stack. RAID levels higher than RAID-0 protect against ONE cause of data loss (disk failure). If the OS writes bad metadata to the filesystem, all any level of RAID can do is faithfully record that.

As for the time taken by the fsck, how big is that filesystem? Judging by those large inode numbers, I'd guess it's pretty big.
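
Something along these lines would show it (assuming the array is mounted at /mnt/md0 and the device is /dev/md0, as elsewhere in this thread):
Code:
df -h /mnt/md0    # used/free space, only works while the filesystem is mounted
tune2fs -l /dev/md0 | grep -Ei 'block count|inode count|block size'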
 
1 member found this post helpful.
Old 07-06-2022, 10:45 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,128

Rep: Reputation: 4121
A single backup is subject to the same frailties as the source. Multiple backups have always been the answer.

As for fsck: it is designed to ensure the integrity of the filesystem, not specifically the files within it. If you have multiply-claimed blocks in a tar backup and you don't know which file wrote to those blocks last, the tar is too suspect to be of any use. Scrub the lot and restore what you have.
 
1 member found this post helpful.
Old 07-07-2022, 01:27 AM   #5
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,555

Original Poster
Rep: Reputation: 177
rknichols and syg00: Yes, it is dawning on me that RAID is not a silver bullet against every type of failure, as I concluded in post #2. I did have multiple backups of the production data, which I restored to a different drive, and the office is happily using that without problems. I am also backing that up every 20 minutes to both local and offsite storage (roughly the cron jobs sketched below).
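
For what it's worth, those 20-minute backups are just cron jobs along these lines (the paths and the offsite host are placeholders here, not the real ones):
Code:
# /etc/crontab-style entries; /srv/office, /mnt/local_backup and offsite.example.com are placeholders
*/20 * * * *  root  rsync -a --delete /srv/office/ /mnt/local_backup/office/
*/20 * * * *  root  rsync -a --delete -e ssh /srv/office/ backup@offsite.example.com:/backups/office/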

The main other thing this RAID is used for is storing backups going back many years, per our retention policy; the most important of those are also kept on other backups (done quarterly and stored on external USB drives in a fireproof safe). As shown in my initial post, the files with multiply-claimed blocks are temporary backups kept for no more than a year. So far fsck has found only 7 such files after 47 hours of running, but more may crop up as it progresses.

Here's another thought: since a RAID-6 can physically lose two of the four members and supposedly not lose data, would pulling two of the drives make fsck go faster?

While I may in the end "Scrub the lot and restore what you have" per syg00's suggestion, my current plan is to let it keep grinding through the weekend and see if it completes. Since the affected files are tar files, I should be able to 'tar -tv' them to see whether they are OK, and delete them if not. My hope is that most of the other tar and zip files on the drive are OK, and I can verify all of them the same way (sketch below).
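
The checking itself will be something like this loop (a sketch, assuming the array is mounted at /mnt/md0; any archive that fails the test gets listed):
Code:
# test-read every tar.bz2 and zip under the backup tree; print the ones that fail
find /mnt/md0/Backups -name '*.tar.bz2' -print0 |
  while IFS= read -r -d '' f; do tar -tjf "$f" >/dev/null || echo "BAD: $f"; done
find /mnt/md0/Backups -name '*.zip' -print0 |
  while IFS= read -r -d '' f; do unzip -tqq "$f" || echo "BAD: $f"; done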

Then I'll see about implementing my own suggestion of breaking these four drives into two RAID-1s and rsync'ing one to the other, being sure first to fsck each RAID so the filesystems are error-free before copying.

I'll go ahead and leave this thread open for a while and post my progress. If anyone has a better idea than my two-RAID-mirror idea, please speak up!
 
Old 07-07-2022, 08:13 AM   #6
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Quote:
Originally Posted by mfoley View Post
Here's another thought: since a RAID-6 can physically lose two of the four members and supposedly not lose data, would pulling two of the drives make fsck go faster?
No. fsck just checks the one virtual device presented by the RAID driver and is ignorant of the RAID structure, and the RAID driver is just going to satisfy each read from one member. fsck wouldn't even detect a mismatch between the mirrored devices unless it happened to get its data from the member that was wrong. That type of error is detected by scrubbing the array, not by fsck.
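
Checking the members against each other means kicking off a scrub through sysfs, roughly like this (assuming the array is md0):
Code:
echo check > /sys/block/md0/md/sync_action   # start a read-only consistency check
cat /proc/mdstat                             # shows the check progress
cat /sys/block/md0/md/mismatch_cnt           # non-zero after the check means the members disagreed
# writing 'repair' instead of 'check' rewrites inconsistent stripes from parity/mirror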

Last edited by rknichols; 07-07-2022 at 08:15 AM.
 
Old 07-08-2022, 03:25 PM   #7
mfoley
Senior Member
 
Registered: Oct 2008
Location: Columbus, Ohio USA
Distribution: Slackware
Posts: 2,555

Original Poster
Rep: Reputation: 177
The fsck finally finished at 3:00 AM yesterday, so about 48 hours to run. Only 4 files were affected by the "multiply-claimed blocks" error, and I've removed them. They were temporary, short-term backup files, so no big deal. I ran fsck again, just to be sure. I've done 'unzip -t' and am now running 'tar -tf' on all the remaining backup files on that drive to be as sure as possible that everything else is OK.

When that's done, I intend to convert the RAID-6 to RAID-5 (note that I've changed my mind about having two RAID-1s). Here's what I propose, and here's where I could use some expert LQ feedback:
Code:
# mark sdd1 as failed, then remove it from the array
mdadm --manage /dev/md0 --fail /dev/sdd1
mdadm --manage /dev/md0 --remove /dev/sdd1
# reshape the remaining three members from RAID-6 to RAID-5
mdadm --grow /dev/md0 --level=raid5 --raid-devices=3 --backup-file=/root/mdadm-backupfile
I have four hot-swap bays holding four 4TB drives.

I'm planning on removing sdd1 for two reasons:

1) To ensure that mdadm builds the RAID-5 from sda1, sdb1, and sdc1. The --grow examples I've found do not specify the physical drives, so I don't know whether mdadm will just pick 3 of the 4 in alphabetical order or at random.

2) Freeing up the 4th bay will let me put an 8TB drive in it, and I'll do an rsync backup of the RAID-6 to that drive before converting to RAID-5.

Does this seem reasonable?
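
While the reshape runs I plan to keep an eye on it with something like this (the speed_limit tweak is optional and the value is just an example, in KB/s):
Code:
cat /proc/mdstat                 # shows reshape progress and an estimated finish time
mdadm --detail /dev/md0          # level, state, and which members are active
# optionally let the reshape use more bandwidth
cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
echo 50000 > /proc/sys/dev/raid/speed_limit_min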

1 week later ...

The grow finally finished after 6 days! I now have an 8TB RAID-5 across 3 disks. The 4th bay holds an ordinary (non-RAID) 8TB drive, and I back up the RAID to that drive twice daily. I think I've finally got "belt and suspenders". The last thing I needed to do was convert the RAID's filesystem to ext4; for whatever reason it was ext2. I did the following:
Code:
# unmount first; these feature changes should not be made on a mounted filesystem
umount /mnt/md0
# enable the ext4 feature set on the existing ext2 filesystem
tune2fs -O has_journal,dir_index,filetype,extent,flex_bg,sparse_super,large_file,uninit_bg,dir_nlink,extra_isize /dev/md0
# a full fsck is required after enabling uninit_bg, before the filesystem is mounted again
fsck.ext4 -f /dev/md0
Everything works. I hope this info proves useful to someone.
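
For anyone repeating this, something like the following should confirm the conversion took before remounting (the mount point is the same one used above):
Code:
tune2fs -l /dev/md0 | grep -i 'features'   # should now list has_journal, extent, etc.
mount /dev/md0 /mnt/md0
grep md0 /proc/mounts                      # the filesystem should now be reported as ext4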

Last edited by mfoley; 07-16-2022 at 12:35 PM. Reason: update
 
  



Tags
fsck ext4 ohshit, raid6




