LinuxQuestions.org (/questions/)
-   Linux - Software (https://www.linuxquestions.org/questions/linux-software-2/)
-   -   software raid: adding "write-intent bitmap" overwrites data on raid device (https://www.linuxquestions.org/questions/linux-software-2/software-raid-adding-write-intent-bitmap-overwrites-data-on-raid-device-4175454470/)

wijte 03-17-2013 10:33 PM

software raid: adding "write-intent bitmap" overwrites data on raid device
 
I tried to add a write-intent bitmap to an existing software raid 5 array. It didn't work; instead, part of the LVM metadata at the start of the array was overwritten with what looks like the write-intent bitmap.

Fortunately, the data could be restored from the LVM metadata backup files. However, I would really like to get to the bottom of why this happened. It is reproducible on this machine, but I have not yet been able to reproduce it in a virtual machine or with loopback devices. Does anyone have ideas about why it happened or how it can be debugged? Any help would be very much appreciated.
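
(For anyone who lands here with the same breakage: recovery from the LVM backup files goes roughly like this. The volume group name bulkvg, the device /dev/md/bulk and the PV UUID are only examples; take the real values from your own /etc/lvm/backup/<vgname> file.)
Code:

# Assumed names for illustration: VG "bulkvg" on PV /dev/md/bulk.
# Recreate the PV label with its old UUID, taken from the metadata backup
sudo pvcreate --uuid "<old-PV-UUID>" --restorefile /etc/lvm/backup/bulkvg /dev/md/bulk
# Restore the VG metadata from the same backup
sudo vgcfgrestore --file /etc/lvm/backup/bulkvg bulkvg
# Reactivate the logical volumes
sudo vgchange -ay bulkvg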

System information:
Fedora 17 x86_64
kernel 3.7.9-104
mdadm 3.2.6-7

Symptoms:
Code:

$ mdadm --grow --bitmap=internal --bitmap-chunk=512M /dev/md/bulk
mdadm: failed to set internal bitmap.
$ dmesg | tail
kernel: [  543.768771] created bitmap (2 pages) for device md124
kernel: [  543.768778] md124: bitmap file is out of date, doing full recovery
kernel: [  543.778904] md124: bitmap initialisation failed: -5
$ dd if=/dev/md/bulk bs=16 count=32 | hexdump -C
00000000  62 69 74 6d 04 00 00 00  4d 11 94 a9 39 85 3d 14  |bitm....M...9.=.|
00000010  d6 98 2e ab ed 21 a5 1d  00 00 00 00 00 00 00 00  |.....!..........|
00000020  00 00 00 00 00 00 00 00  00 fc 01 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 10 00  05 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00000200
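
The first four bytes, 62 69 74 6d, spell "bitm", the magic number of an md bitmap superblock, so the start of the array really was overwritten by the bitmap rather than by random garbage. A quick way to check any array for the same signature (the device name is only an example):
Code:

# Print the first 4 bytes of the array device; "bitm" means an md bitmap
# superblock sits where the filesystem/LVM data should begin.
sudo dd if=/dev/md/bulk bs=4 count=1 2>/dev/null | hexdump -C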

Edit (removed all debugging output and speculation):
This happens because the raid5 array was converted from a raid0 array. The original raid0 layout leaves no room for a bitmap between the superblock and the data, because a raid0 array cannot have a write-intent bitmap in the first place (there is no redundancy to resynchronise). Checking the data location:
Code:

$ mdadm --examine /dev/sdb1 | grep Offset
    Data Offset : 16 sectors
  Super Offset : 8 sectors

The mdadm code places the internal bitmap 4KiB after the superblock; with a super offset of 8 sectors and a data offset of 16 sectors, that is exactly where the real data starts.
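
In other words, the gap between superblock and data is only 16 - 8 = 8 sectors (4KiB), so the bitmap lands right on top of the start of the user data. A rough way to check a member device before adding a bitmap (the awk field positions match the --examine output above; double-check them against your mdadm version):
Code:

# Print the gap (in sectors) between the superblock and the start of the data
# on one member device. A gap this small means an internal bitmap would land
# on top of user data. /dev/sdb1 is only an example.
sudo mdadm --examine /dev/sdb1 | awk '
  /Data Offset/  { data = $4 }
  /Super Offset/ { super = $4 }
  END { printf "gap: %d sectors (%d KiB)\n", data - super, (data - super) / 2 }'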

As a workaround, you can fail and remove the affected drive(s) from the array one at a time, wipe their superblocks and add them back again. The freshly written superblock uses a larger data offset, which leaves enough room for the bitmap; see the example below.
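
On a real array that looks roughly like this, one member at a time (array and device names are only examples; let the rebuild finish before touching the next disk):
Code:

# Example only: array /dev/md/bulk, member /dev/sdb1. Repeat for each affected
# member, one at a time, waiting for the resync to complete in between.
sudo mdadm /dev/md/bulk --fail /dev/sdb1
sudo mdadm /dev/md/bulk --remove /dev/sdb1
sudo mdadm --zero-superblock /dev/sdb1
sudo mdadm /dev/md/bulk --add /dev/sdb1
sudo mdadm --wait /dev/md/bulk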

The following script reproduces the problem using loopback devices.
Code:

#!/usr/bin/bash

# Created: March 18, 2013
# Author: Lars Wijtemans <lars23091019 - gmail.com>

# Use at your own risk

# This script demonstrates mdadm overwriting user data
# when adding a write-intent bitmap to an array that was
# converted from raid0, leaving 4KiB between the start
# of the superblock and the start of the user data.

# Tested on Fedora 17 x86_64, kernel 3.7.9-104, mdadm 3.2.6-7

RAIDPREFIX="test"
WORKDIR="/tmp"
DEVSIZE="8" # in MB. Script will use 3*DEVSIZE

cd "$WORKDIR"

# Sanity checks
if [ -e "/dev/md/$RAIDPREFIX-five" ]; then
  echo "Raid device $RAIDPREFIX-five exists, stopping"
  exit
fi

echo "Please read the script before executing"
echo "Use at your own risk, you can Ctrl-C now"
read -p "Test workaround? [y/n]"
WORKAROUND=$REPLY

# Create loopback devices
for i in 0 1 2; do
  if [ -e "disk$i.img" ]; then
    echo "File disk$i.img exists, stopping"
    if [ $i -eq 0 ]; then
      # Nothing to clean up
      exit
    fi
    # Remove created files
    LAST=$(( $i - 1 ))
    for (( d=0; d<=$LAST; d++ )); do
      sudo losetup -d "${DISK[$d]}"
      rm "disk$d.img"
    done
    exit
  fi
  COUNT=$DEVSIZE
  dd if=/dev/zero of="disk$i.img" bs=1M count=$COUNT
  DISK[$i]=$(sudo losetup -f --show "disk$i.img")
done


# Create raid device as raid0
sudo mdadm --create "/dev/md/$RAIDPREFIX-five" --level=0 \
--raid-devices=2 "${DISK[0]}" "${DISK[1]}"

# Convert it to raid5
sudo mdadm --grow "/dev/md/$RAIDPREFIX-five" --level=5 \
--raid-devices=3 --add "${DISK[2]}"

sudo mdadm --wait "/dev/md/$RAIDPREFIX-five"


# Workaround: re-create the superblocks of the two former raid0 members,
# one at a time, so the new data offset leaves room for the bitmap
if [[ $WORKAROUND =~ ^[Yy]$ ]]; then
  echo "Applying workaround"
  # Re-add first disk from previous raid0
  sudo mdadm "/dev/md/$RAIDPREFIX-five" --fail "${DISK[0]}"
  sudo mdadm "/dev/md/$RAIDPREFIX-five" --remove "${DISK[0]}"
  sudo mdadm --zero-superblock "${DISK[0]}"
  sudo mdadm "/dev/md/$RAIDPREFIX-five" --add "${DISK[0]}"
  sudo mdadm --wait "/dev/md/$RAIDPREFIX-five"
  # Second disk
  sudo mdadm "/dev/md/$RAIDPREFIX-five" --fail "${DISK[1]}"
  sudo mdadm "/dev/md/$RAIDPREFIX-five" --remove "${DISK[1]}"
  sudo mdadm --zero-superblock "${DISK[1]}"
  sudo mdadm "/dev/md/$RAIDPREFIX-five" --add "${DISK[1]}"
  sudo mdadm --wait "/dev/md/$RAIDPREFIX-five"
fi


# Put dummy data on raid device
echo "Writing random data"
sudo dd if=/dev/urandom of="/dev/md/$RAIDPREFIX-five" bs=1M 2>/dev/null
echo -e "\nArray device contains:"
sudo dd if="/dev/md/$RAIDPREFIX-five" bs=16 count=3 2>/dev/null | hexdump -C


# Add write-intent bitmap
sudo mdadm --grow --bitmap=internal --bitmap-chunk=1M \
"/dev/md/$RAIDPREFIX-five"


# Check the data
echo -e "\nArray device contains:"
# With the workaround the data should be intact, so a short dump is enough;
# without it, dump further to show the bitmap that overwrote the data.
if [[ $WORKAROUND =~ ^[Yy]$ ]]; then
  TESTSIZE=3
else
  TESTSIZE=35
fi
sudo dd if="/dev/md/$RAIDPREFIX-five" bs=16 count=$TESTSIZE 2>/dev/null \
| hexdump -C


echo "press return to continue with cleanup"
read

# Clean up
sudo mdadm --stop "/dev/md/$RAIDPREFIX-five"

for i in 0 1 2; do
  sudo losetup -d "${DISK[$i]}"
  rm "disk$i.img"
done


corp769 03-27-2013 02:34 PM

Nice work, and good write-up. Thanks for the effort! (Plus replying to get this off the zero reply list :p)

Cheers,

Josh

