Old 10-18-2019, 12:08 PM   #1
BearTom
LQ Newbie
 
Registered: Oct 2019
Posts: 5

Rep: Reputation: Disabled
raid arrays not assembling


My raid arrays stopped assembling. I have a nested configuration of:

/dev/sda1 + /dev/sdb1 in raid0 > /dev/md132
/dev/sde1 + /dev/sdc1 in raid0 > /dev/md131
/dev/md132 + /dev/md131 + /dev/sdd1 in raid5 > /dev/md128.
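
For reference, a nested layout like this would typically have been created with commands roughly along these lines (just a sketch of the structure, not the exact commands or options used at creation time):
Code:
# two raid0 stripes (sketch; chunk size etc. left at defaults)
mdadm --create /dev/md132 --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md131 --level=0 --raid-devices=2 /dev/sde1 /dev/sdc1
# raid5 on top of the two stripes plus one plain partition
mdadm --create /dev/md128 --level=5 --raid-devices=3 /dev/md132 /dev/md131 /dev/sdd1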

Both raid0 arrays refuse to assemble, and (I assume as a consequence) /dev/md128 also fails.

The arrays worked fine for weeks before the current issue. I am on Fedora 30; yesterday I did a dnf update, followed by a manual shutdown (with the shutdown command).

Some mdadm diagnostic output:

Code:
[root@piglet ~]# mdadm --detail /dev/md132
/dev/md132:
           Version : 1.2
     Creation Time : Tue Sep 10 18:13:21 2019
        Raid Level : raid0
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Tue Sep 10 18:13:21 2019
             State : active, FAILED, Not Started 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

        Chunk Size : 512K

Consistency Policy : unknown

              Name : piglet:132  (local to host piglet)
              UUID : 360771e4:018d3478:81883fa3:a6b5f578
            Events : 0

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed

       -       8        1        0      sync   /dev/sda1
       -       8       17        1      sync   /dev/sdb1
Code:
mdadm --detail /dev/md131
/dev/md131:
           Version : 1.2
     Creation Time : Wed Jun  7 19:38:24 2017
        Raid Level : raid0
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Wed Jun  7 19:38:24 2017
             State : active, FAILED, Not Started 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

        Chunk Size : 512K

Consistency Policy : unknown

              Name : piglet:131  (local to host piglet)
              UUID : 1dea08ea:326b7b82:1430d1bc:1c2fac1c
            Events : 0

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed

       -       8       65        0      sync   /dev/sde1
       -       8       33        1      sync   /dev/sdc1
Code:
[root@piglet ~]# mdadm --detail /dev/md128
/dev/md128:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 1
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 1

              Name : piglet:128  (local to host piglet)
              UUID : 1a4e32eb:1cd1a4bf:122e69cc:c9e996c9
            Events : 68089

    Number   Major   Minor   RaidDevice

       -       8       49        -        /dev/sdd1
Code:
[root@piglet ~]# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 360771e4:018d3478:81883fa3:a6b5f578
           Name : piglet:132  (local to host piglet)
  Creation Time : Tue Sep 10 18:13:21 2019
     Raid Level : raid0
   Raid Devices : 2

 Avail Dev Size : 3906762752 (1862.89 GiB 2000.26 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : 01d01cd7:06904ff9:313e8481:579b89e1

    Update Time : Tue Sep 10 18:13:21 2019
  Bad Block Log : 512 entries available at offset 8 sectors
       Checksum : 96fb86ce - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
Code:
[root@piglet ~]# mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 360771e4:018d3478:81883fa3:a6b5f578
           Name : piglet:132  (local to host piglet)
  Creation Time : Tue Sep 10 18:13:21 2019
     Raid Level : raid0
   Raid Devices : 2

 Avail Dev Size : 1953257472 (931.39 GiB 1000.07 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : 330150e2:948d0372:5211b844:1b9a5da0

    Update Time : Tue Sep 10 18:13:21 2019
  Bad Block Log : 512 entries available at offset 8 sectors
       Checksum : 287a6773 - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
I am skipping the remaining mdadm --examine /dev/sd*1 outputs, because I think the problems have a common root and this information should be enough.

Stopping and assembling the raid0 array gives an error:
Code:
[root@piglet ~]# mdadm --stop /dev/md132
mdadm: stopped /dev/md132
[root@piglet ~]# mdadm --assemble /dev/md132
mdadm: failed to RUN_ARRAY /dev/md132: Unknown error 524
I found this commit https://github.com/torvalds/linux/co...57bd3932f39ac9 , but I am not sure whether it applies to me, or how to test.
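
One thing I could check, I suppose, is whether my current kernel already has the module parameter that this commit introduces (just an idea, I have not verified that this is conclusive):
Code:
uname -r
# if the patched raid0 module is loaded, this file should exist:
cat /sys/module/raid0/parameters/default_layout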

Since two raid0 arrays broke at the same time, I would guess the cause is a software problem (driver?), the update, or the manual shutdown.

Any help on how to get the arrays running again, or how to diagnose the problem, is very much appreciated.
 
Old 10-19-2019, 04:31 AM   #2
BearTom
LQ Newbie
 
Registered: Oct 2019
Posts: 5

Original Poster
Rep: Reputation: Disabled
Reason found, but questions about the fix

I found the source of the problem, but still have questions.

Code:
dmesg -T |grep raid0
[Sat Oct 19 10:03:27 2019] md/raid0:md131: cannot assemble multi-zone RAID0 with default_layout setting
[Sat Oct 19 10:03:27 2019] md/raid0: please set raid.default_layout to 1 or 2
[Sat Oct 19 10:03:27 2019] md/raid0:md132: cannot assemble multi-zone RAID0 with default_layout setting
[Sat Oct 19 10:03:27 2019] md/raid0: please set raid.default_layout to 1 or 2
This means the cause is indeed the kernel patch (link in the previous post), which refuses to assemble the array when the layout it should use is not explicitly set. Apparently it is technically not possible to detect the layout automatically (for disassembled arrays?), but it is beyond my understanding why the update cannot record the layout on systems where the array was working fine (i.e. set the parameter before applying the patch). While the array is running, the driver is using a specific layout, right? And the kernel version is also known. This way of patching will make, or already has made, many systems unbootable.

To make things worse, the documentation of how and where to set the layout is confusing:

- First, in contrast with the error message, the kernel parameter should be raid0.default_layout (note the 0), not raid.default_layout. I think this will be corrected in future updates.
- Second, the description of the patch mentions values of 0 and 1, but the possible parameter values have since been changed to 0, 1, and 2, where 0 means 'not set' (so the array will not assemble); the meaning of 1 and 2 is not properly documented.
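
To be concrete about what "setting the parameter" means: as far as I can tell it can be set either at runtime through sysfs (once the raid0 module is loaded), or at boot on the kernel command line. A sketch:
Code:
# at runtime, with the raid0 module loaded:
echo 2 > /sys/module/raid0/parameters/default_layout
# at boot, as a kernel command line argument (note the 0 in raid0):
#   raid0.default_layout=2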

The best description I found is here https://blog.icod.de/2019/10/10/caut...efault_layout/

This post says value 2 refers to the new layout (kernel 3.14 and later) and 1 to the old layout (kernel 3.13 and older). However, on Reddit the opposite is suggested: https://www.reddit.com/r/linuxquesti...efault_layout/ . The Linux kernel documentation doesn't even mention the parameter raid0.default_layout: https://www.kernel.org/doc/html/late...=raid%20layout .
The only reference that sort of confirms 1=old, 2=new (multi-zone) is a patch snippet in this mail: https://marc.info/?l=linux-raid&m=156804178826466&w=2 .
Code:
[snip]

> +enum r0layout {
> +	RAID0_ORIG_LAYOUT = 1,
> +	RAID0_ALT_MULTIZONE_LAYOUT = 2
I also read a suggestion that it should be possible to set the parameter to 1 or 2 (pick one), assemble the array with --readonly, mount it, and see whether all files are there (at your own risk). However, I cannot do that directly, because md131 and md132 are parts of a raid5 array, and I am not sure how I could test them individually.

My plan is to set the parameter and assemble all devices in read-only mode first. Hopefully that provides enough safety in case the wrong value is used.

The plan:
1. Set the parameter to 2 (as that is more likely to be correct)
Code:
echo 2 > /sys/module/raid0/parameters/default_layout
2. Try to assemble the array /dev/md132
Code:
mdadm --assemble --readonly /dev/md132
3. Examine the resulting device
Code:
mdadm --examine /dev/md132
Though I am not sure how convincing that will be, even if the info seems correct.

4. If step 3 seems fine, do steps 2 and 3 for /dev/md131 too.

5. Try to assemble the raid5 array and check its details
Code:
mdadm --assemble --readonly /dev/md128
mdadm --detail /dev/md128
6. Mount /dev/md128 (also read-only) and see whether the files look normal. It's a 6TB device, so a full check will be impossible.

7. Manually initiate a raid-check on /dev/md128. Is that possible for read-only devices? I only need a confirmation that all raid5 members are in sync. If they are not in sync, I don't want it to be corrected (changed); instead I should probably re-assemble the raid0 devices with parameter 1.

8. If any of steps 2-7 fails, maybe try parameter 1 (depending on the nature of the failure).

9. If everything went fine, make the parameter setting permanent (see the sketch after this list) and document the correct value, because if the system crashes and the arrays are moved to a new machine, the question will come back. (And in my opinion the layout should be stored in the metadata of the array.)
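
For step 9, I assume making it permanent can be done either with a module option or with a kernel command-line argument; a sketch of what I have in mind (untested; the conf file name is just an example):
Code:
# option A: module option (the initramfs may need rebuilding if raid0 loads early)
echo 'options raid0 default_layout=2' > /etc/modprobe.d/raid0-layout.conf
dracut --force
# option B: kernel command line, applied to all installed kernels
grubby --update-kernel=ALL --args='raid0.default_layout=2'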

Before I execute the plan, I would highly appreciate an expert opinion on whether this would be a safe and a working plan.
 
1 member found this post helpful.
Old 10-19-2019, 06:44 AM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,124

Rep: Reputation: 4120
[caveat]not an expert, but I feel your pain[/caveat]

However I'm not short of opinions, so here's a few:
- kudos for your persistence
- that's amongst the whackiest RAID constructions I could imagine
- RAID0 is just asking for trouble
- data not backed up is by definition not worth the trouble
- whoever let this patch through needs their arse kicked forever
- few (none ?) of us will be able to offer relevant advice without direct experience.
- have you tried booting from one of the prior kernels Fedora maintains by default?

If 'twas me, I'd take an image of all the partitions and do all the fiddling from another system using the images. I keep old systems around for just this sort of circumstance - it doesn't need to be the latest-and-greatest. Even a liveCD would probably work, but you'd need to do the dnf update.
 
Old 10-19-2019, 09:41 AM   #4
BearTom
LQ Newbie
 
Registered: Oct 2019
Posts: 5

Original Poster
Rep: Reputation: Disabled
Thanks for the response.

Quote:
Originally Posted by syg00 View Post
- kudos for your persistence
Thanks. Actually, I am writing here in this much detail because I expect that many people will run into the same problem and will find that the currently available information is lacking. Hopefully more helpful information will accumulate here, eventually.

Quote:
- that's amongst the whackiest RAID constructions I could imagine
- RAID0 is just asking for trouble
- data not backed up is by definition not worth the trouble
Those are different discussions. For me this setup provides a meaningful trade-off between disk capacity and risk; the raid array in question is an intermediate backup step between files in daily use and an off-line backup.

Quote:
- whoever let this patch through needs their arse kicked forever
Yeah ... I think it was a mistake with quite an impact. At the same time, these are probably the same people who normally keep our systems working properly.

Quote:
- few (none ?) of us will be able to offer relevant advice without direct experience.
The main questions are:
- Whether the --readonly option provides enough protection to prevent ruining the array if the wrong parameter is chosen.
- How the resulting array can be verified (through sync/scrub) in read-only mode.
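
For the second question, the usual way to trigger a check seems to be through sysfs, but I don't know whether the kernel accepts it while the array is read-only; this is what I would try (unverified):
Code:
# request a scrub of the raid5 array (probably requires a read-write array)
echo check > /sys/block/md128/md/sync_action
# watch progress in /proc/mdstat, then look at the mismatch counter:
cat /sys/block/md128/md/mismatch_cnt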

I would expect that people who know more about the internals of mdadm can say something about this, even if not explicitly tested in relation with this patch.

Quote:
- have you tried booting from one of the prior kernels Fedora maintains by default
I haven't. I could, and it would probably work, but then what? It did work with the kernel before my last update, so I know that already, but I don't know how to benefit from that knowledge.

Quote:
If it t'was me I'd take an image of all the partitions and do all your fiddling from another system using the images. I keep old systems for just this sort of circumstance - doesn't need to be latest-and-greatest. Even a liveCD would probably work, but you'd need to do the dnf update.
I can't take an image, because the disks are too large, and I simply don't have a spare set of disks of this size lying around (array size is 6TB).

Though, I could test by creating another raid0 array with raid0.default_layout=1, stopping the array, changing the parameter to 0 and trying to assemble it (which should fail), then changing the parameter to 2, assembling it, and testing it. Then change the parameter back to 0, and then to 1, and test again. I have a VM for this kind of test. I will need a bit of time for that and will report back. Thanks for the idea.
 
Old 10-19-2019, 04:35 PM   #5
BearTom
LQ Newbie
 
Registered: Oct 2019
Posts: 5

Original Poster
Rep: Reputation: Disabled
Did some testing, outcomes are surprising

I did my tests, but the results confused me in a new way.

My purpose was to verify that setting the raid0.default_layout parameter to the wrong value and assembling the array with the --readonly option does no harm to the array (so it can be re-assembled with the correct setting later), and that assembling with the wrong value leads to some visible failure. However, the array seems to work even with the wrong parameter.

This is what I did:

1. Create two partitions with *different* sizes: /dev/sdb1 (200MiB), /dev/sdb2 (400MiB)
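
For completeness, I created the test partitions roughly like this (assuming /dev/sdb is an empty test disk in the VM):
Code:
parted -s /dev/sdb mklabel gpt
parted -s /dev/sdb mkpart primary 1MiB 201MiB     # /dev/sdb1, 200MiB
parted -s /dev/sdb mkpart primary 201MiB 601MiB   # /dev/sdb2, 400MiB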

2. Create array
Code:
mdadm --create /dev/md0 --level=raid0 --raid-devices=2 /dev/sdb1 /dev/sdb2
mdadm: Defaulting to version 1.2 metadata
mdadm: RUN_ARRAY failed: Unknown error 524
Okay, so I probably need to set the kernel parameter first. The error message could have been more descriptive.

3. Zero superblocks, and start over:
Code:
mdadm --zero-superblock /dev/sdb1
mdadm --zero-superblock /dev/sdb2

echo 1 > /sys/module/raid0/parameters/default_layout

mdadm --create /dev/md0 --level=raid0 --raid-devices=2 /dev/sdb1 /dev/sdb2
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
4. Save configuration
Code:
mdadm --detail --scan > /etc/mdadm.conf
5. Make ext4 filesystem
Code:
mkfs.ext4 /dev/md0 
mke2fs 1.44.6 (5-Mar-2019)
Creating filesystem with 523264 4k blocks and 130816 inodes
Filesystem UUID: fc54fa2f-c9cb-4467-b3a0-08c60be6ae5d
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
6. Mount device on a dir named test:
Code:
mount /dev/md0 test
7. Generate random files to fill up the whole device. A copy of the files is stored on another device in a folder named test_copy and will be used to check the array after re-assembling.
Code:
for i in {1..600}; do dd if=/dev/urandom bs=1M count=1 of=test/file$i; done
...
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0145325 s, 72.2 MB/s
...
dd: failed to open 'test/file558': No space left on device

cp -rv test/* test_copy
8. Check for differences, just to be sure that the initial state is correct.
Code:
diff -rq test test_copy
(no output means they are identical)

Okay, now let's try to ruin the array by setting the parameter to something other than 1 (the value set before the array was created).

9. Unmount and stop the device
Code:
umount /dev/md0

mdadm --stop /dev/md0 
mdadm: stopped /dev/md0
10. Unset default_layout parameter and try to assemble
Code:
echo 0 > /sys/module/raid0/parameters/default_layout

mdadm --assemble --readonly /dev/md0
mdadm: failed to RUN_ARRAY /dev/md0: Unknown error 524

dmesg -T |grep raid0
[Sat Oct 19 22:27:28 2019] md/raid0:md0: cannot assemble multi-zone RAID0 with default_layout setting
[Sat Oct 19 22:27:28 2019] md/raid0: please set raid.default_layout to 1 or 2
So, this correctly refuses to assemble, because the default_layout has not been specified.

11. Now set it to 2.
Code:
echo 2 > /sys/module/raid0/parameters/default_layout 
mdadm --assemble --readonly /dev/md0
mdadm: /dev/md0 has been started with 2 drives.
It assembles,

12. but does it work?
Code:
mount /dev/md0 test
mount: /mnt/test: WARNING: device write-protected, mounted read-only.

diff -r test test_copy
No difference found, so the files are identical. This confuses me, because I thought a wrong parameter would ruin the array. I also tried the other way around: setting the parameter to 2 before creation, stopping the array, setting the parameter to 1, and assembling the array. That also seemed to lead to a working array, which is equally unexpected.

I don't know what is going on. Or is my testing procedure wrong? Maybe my way of creating the raid0 array is not complicated enough? I also did a similar test with three devices of different sizes (100M, 200M, 300M), and that led to the same result.
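
(The three-device variant followed the same procedure, with the partitions recreated at those sizes and a create step that looked roughly like this:)
Code:
# sketch of the three-device test array
mdadm --create /dev/md0 --level=raid0 --raid-devices=3 /dev/sdb1 /dev/sdb2 /dev/sdb3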

I thought the patch makes systems unbootable because manual intervention and sysadmin wisdom are needed to select the one right value for the parameter (and a wrong value may ruin the data forever), and now it looks like the value doesn't matter?
 
Old 10-20-2019, 03:34 AM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,124

Rep: Reputation: 4120
The link says the situation can lead to corruption, not that it (always) will.
If this is, as you say, merely transient data, erase the array and rebuild it from scratch on the latest kernel. Get on with life. Just hope they don't revert the change at some future time.
Yes, you (we) will remain in ignorance, but is it really worth the angst to find out?
 
Old 10-29-2019, 05:48 PM   #7
BearTom
LQ Newbie
 
Registered: Oct 2019
Posts: 5

Original Poster
Rep: Reputation: Disabled
I re-assembled the array with the default_layout parameter set to 2. Comparing with a recent backup showed that the data is fine.

Just re-creating the raid0 array without understanding the meaning of the parameter values makes no sense: a choice has to be made between 1 and 2, otherwise the array will not even be created. You'd better know why you chose the red or the blue pill.

Please note that there are messages on the kernel raid mailing list asking for better documentation. I am marking this issue as solved because I don't expect more info here.
 
1 member found this post helpful.
Old 01-09-2020, 10:41 AM   #8
Farrow
LQ Newbie
 
Registered: Jan 2020
Posts: 1

Rep: Reputation: Disabled

Ran into the same issue after upgrading to Fedora 31 and having a disk crash just afterwards.

Tried to recreate a raid0 array with a new drive as part of the array and boom, same issue as BearTom. Just wanted to thank BearTom for his research into this matter.

Set my raid0.default_layout to 2 and rebuilt my grub2 config and away she went.
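
For anyone else landing here, rebuilding the grub2 config on Fedora goes roughly like this (the grub.cfg path differs between BIOS and EFI installs):
Code:
# add raid0.default_layout=2 to GRUB_CMDLINE_LINUX in /etc/default/grub, then:
grub2-mkconfig -o /boot/grub2/grub.cfg               # BIOS
# grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg    # EFI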

Thanks again BearTom.
 
  

