LinuxQuestions.org
Old 02-25-2008, 05:59 PM   #1
SanRamonCA
LQ Newbie
 
Registered: Feb 2008
Posts: 8

Rep: Reputation: 0
Raid1 MDADM, root volume weirdness


I was advised this may be a better place to ask for some assistance:

I have been battling this for months and just not sure what the heck is going on.

FC6 2.6.18, software raid setup.

Everything seems to be created okay; however, my /proc/mdstat shows this:

Personalities : [raid1]
md7 : active raid1 sda7[0]
63844992 blocks [2/1] [U_]

md3 : active raid1 sdb3[1] sda3[0]
2007744 blocks [2/2] [UU]

md5 : active raid1 sdb5[1] sda5[0]
4015552 blocks [2/2] [UU]

md6 : active raid1 sdb6[1] sda6[0]
4015552 blocks [2/2] [UU]

unused devices: <none>
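
A minimal sketch of how a degraded array like md7 above could be spotted automatically. It assumes the /proc/mdstat layout shown in this post (array name on one line, the [UU] or [U_] status on the next). The function name degraded_arrays is made up for illustration.

```shell
#!/bin/sh
# degraded_arrays (hypothetical helper): list md arrays with a missing member.
# Reads mdstat-formatted text from the file named in $1, or /proc/mdstat by default.
degraded_arrays() {
    awk '
        # Header line such as "md7 : active raid1 sda7[0]": remember the array name.
        /^md[0-9]+ :/ { name = $1 }
        # Status line such as "63844992 blocks [2/1] [U_]": "_" marks a missing member.
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }
    ' "${1:-/proc/mdstat}"
}
```

Called with no argument it reads the live /proc/mdstat; passing a saved copy of the file as the first argument works the same way.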

dmesg:
for /dev/sdb7

" sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 >
md: could not bd_claim sdb7. <--- NO idea what has it open.
EXT3 FS on sdb7, internal journal"

for /dev/sda7:
md: considering sda7 ...
md: adding sda7 ...
md: bind<sda7>
md: running: <sda7>

However I can mount /dev/sdb7 to /mnt without any issues, after the system is up.

Here is where it makes no sense to me..

mdadm output from /dev/sda7 and /dev/sdb7. If I query /dev/sda7 it shows that /dev/sdb7 is faulty and that /dev/sda7 is driving the bus.

[ops /]#/sbin/mdadm --query --examine /dev/sda7
/dev/sda7:
Magic : a92b4efc
Version : 00.90.00
UUID : 6ba1f68b:73389d27:109f02d2:e899a2a9
Creation Time : Mon Feb 25 13:36:07 2008
Raid Level : raid1
Device Size : 63844992 (60.89 GiB 65.38 GB)
Array Size : 63844992 (60.89 GiB 65.38 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 7

Update Time : Mon Feb 25 13:36:07 2008
State : active
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 149324d1 - correct
Events : 0.1


Number Major Minor RaidDevice State
this 0 8 7 0 active sync /dev/sda7

0 0 8 7 0 active sync /dev/sda7
1 1 0 0 1 faulty


If I do the same examination on /dev/sdb7 it shows that everything is hunky-dory: sda7 is driving, but sdb7 is ready to take over.

[ops /]# /sbin/mdadm --query --examine /dev/sdb7
/dev/sdb7:
Magic : a92b4efc
Version : 00.90.00
UUID : 1a060e52:5465c670:1cd8a237:b28461d5
Creation Time : Mon Feb 25 13:10:16 2008
Raid Level : raid1
Device Size : 63844992 (60.89 GiB 65.38 GB)
Array Size : 63844992 (60.89 GiB 65.38 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 7

Update Time : Mon Feb 25 13:19:27 2008
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 7acc24b6 - correct
Events : 0.3


Number Major Minor RaidDevice State
this 1 8 23 1 active sync /dev/sdb7

0 0 8 7 0 active sync /dev/sda7
1 1 8 23 1 active sync /dev/sdb7

Oh current mounts:

/dev/md7 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/md6 on /logs type ext3 (rw)
/dev/md3 on /tmp type ext3 (rw)
/dev/md5 on /var type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

I can't mount /dev/sda7 since it believes it's active as part of /dev/md7 (it is), but /dev/sdb7 should be as well..

Ideas or more info needed from me?

Another piece of the pie (I think the system has /dev/sda7 and /dev/sdb7 messed up somehow).

currently /dev/md7 is mounted to /, /proc/mdstat shows /dev/sda7 as the only "healthy" disk in the array.

However if I mdadm --stop /dev/md7
mount /dev/sdb7 /mnt
touch /testfile
ls /mnt

I see the test file where I created it, in / (which is /dev/md7, which should be /dev/sda7), and I also see it under /mnt, where I mounted /dev/sdb7.

Something is wrong and I am having a hard time understanding what the heck. UUIDs are all the same between /dev/sda7, /dev/sdb7 and /dev/md7 (not sure if that's causing an issue as well or not)

Thanks
Tory
 
Old 02-26-2008, 11:29 AM   #2
Randall Slack
Member
 
Registered: Feb 2005
Location: Rotterdam, The Netherlands
Distribution: Debian - Ubuntu
Posts: 219

Rep: Reputation: 30
some thoughts,

1 ) did you make sure any remaining info from previous raids was cleared?
mdadm --zero-superblock /dev/sda7
mdadm --zero-superblock /dev/sdb7

if you get an error like this it means it's OK
mdadm: Unrecognised md component device - /dev/sda7

if you don't, try it again

if it is any help to you i made some notes on my last raid10 installation, which in turn is based on a raid1 debian howto so the general principle should work.

http://songshu.org/doku/doku.php?id=host.cipar.net
 
Old 02-26-2008, 12:51 PM   #3
SanRamonCA
Thanks Randall

I've not done that specifically but it could be a good place to look as I see:

> /sbin/mdadm --examine --scan /dev/sdb7
ARRAY /dev/md7 level=raid1 num-devices=2 UUID=b15c48a0:5a6bfe87:c2510104:eba2cacc
> /sbin/mdadm --examine --scan /dev/sda7
ARRAY /dev/md7 level=raid1 num-devices=2 UUID=e4e07e8a:0b7b23b7:88c987c9:5ca4ba55

Not sure that I should see 2 different UUIDs here.. Hmmm interesting.
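
The UUID comparison above can be scripted. This is only a sketch: it assumes the 0.90 superblock output format shown earlier in the thread, where the line reads "UUID : value", and the helper names examine_uuid and same_array are invented for illustration.

```shell
#!/bin/sh
# examine_uuid (hypothetical helper): print the array UUID found in
# `mdadm --examine` output read from stdin.
# Assumption: 0.90 superblock output, where the line reads "UUID : <value>".
examine_uuid() {
    awk '$1 == "UUID" && $2 == ":" { print $3; exit }'
}

# same_array (hypothetical helper): succeed only if both UUIDs are non-empty and equal.
same_array() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example use on a live system (needs root), not executed here:
#   a=$(mdadm --examine /dev/sda7 | examine_uuid)
#   b=$(mdadm --examine /dev/sdb7 | examine_uuid)
#   same_array "$a" "$b" || echo "sda7 and sdb7 belong to different arrays"
```

Two members of one healthy mirror should report the same array UUID, so a mismatch suggests the two partitions carry superblocks from different create runs.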


I do wipe the drives and repartition, but I'm not changing the partition sizes so it's possible the data is still there?

# Wipe the MBR (Master Boot Record) clean.
dd if=/dev/zero of=$DISK0 bs=512 count=1 || shellout
blockdev --rereadpt $DISK0
parted -s -- $DISK0 mklabel msdos || shellout

Let me try to reinstall and see if that works.

Okay, added a zero-superblock step but same issue. (It zeroed all the devices before the md was created.)


md1 : active raid1 sdb1[1] sda1[0]
250880 blocks [2/2] [UU]

md7 : active raid1 sda7[0]
63844992 blocks [2/1] [U_] <------

md3 : active raid1 sdb3[1] sda3[0]
2007744 blocks [2/2] [UU]

md5 : active raid1 sdb5[1] sda5[0]
4015552 blocks [2/2] [UU]

md6 : active raid1 sdb6[1] sda6[0]
4015552 blocks [2/2] [UU]

dmesg shows the same thing

md: could not bd_claim sdb7. <-- which, from everything I've read, indicates that /dev/sdb7 is open, but there is nothing I can point to that shows it being open. In fact:

[@ops ~]# mount /dev/sdb7 /mnt
[ops ~]# cd /mnt
[@ops mnt]# ls
admin boot dev etc ipix logs logs-dev lost+found misc mounts opt root selinux sys usr
bin data dev760 home lib logs-all logs-qa media mnt net proc sbin srv tmp var

[@ops mnt]# mount
/dev/md7 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/md6 on /logs type ext3 (rw)
/dev/md3 on /tmp type ext3 (rw)
/dev/md5 on /var type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/sdb7 on /mnt type ext3 (rw)

So why mdadm refuses to allow /dev/sdb7 to play in any raid1 games, is beyond me. I've tried 4 different disks, same stuff, i've tried 2 different chassis (hardware) same. So this is darn odd!

Thanks, it was an attempt for sure.

It really seems that the system has /dev/sdb7 and /dev/sda7 confused. Why?

/sbin/mdadm --create /dev/md7 -l raid1 -n 2 /dev/sdb7 "missing"
mdadm: Cannot open /dev/sdb7: Device or resource busy
mdadm: create aborted

/dev/sdb7 is not mounted or connected to anything, although md feels it is, even after doing a mdadm --stop.

I can mount /dev/sdb7 to /mnt without issues. However mdadm believes it's busy.

really weird and frustrating
 
Old 02-26-2008, 01:08 PM   #4
Randall Slack
you are creating the raid on a running / partition right?? maybe that's why it's busy.

have a look at my wiki and the link to the original article it was based on

what you need to do is create an md device without the running disk, copy the contents of the running disk to the md device, boot from the md device and then add the missing disk (sda7) to the md device....

something like this, assuming you are running on sda7

mdadm --create /dev/md7 --level=1 --raid-disks=2 missing /dev/sdb7

copy the contents of sda7 onto /dev/md7 (make a filesystem on md7 and mount it first)

then change grub to boot from /dev/md7

reboot

and then add sda7 to the array


but then again I could be completely missing the point here since I've had a few beers already
 
Old 02-26-2008, 01:33 PM   #5
SanRamonCA
Actually it's being done via an imaging tool (systemimager) so I don't have anything mounted when this stuff is created, nada, nichts, keine.

But why it continues to do funny stuff with just the root volume is beyond me, heck I can mirror swap if I wanted, no issue..

It's ALWAYS the raid array that moans.

Here is the image process

Feb 26 10:20:37 ops root: logger: Load software RAID modules.
Feb 26 10:20:37 ops root: logger: yes | mdadm --zero-superblock /dev/sda[1-7] /dev/sdb[1-7]
Feb 26 10:20:37 ops root: logger: yes | mdadm --create /dev/md1 --auto yes --level raid1 --raid-devices 2 /dev/sda1 /dev/sdb1
Feb 26 10:20:38 ops root: logger: yes | mdadm --create /dev/md3 --auto yes --level raid1 --raid-devices 2 /dev/sda3 /dev/sdb3
Feb 26 10:20:38 ops root: logger: yes | mdadm --create /dev/md5 --auto yes --level raid1 --raid-devices 2 /dev/sda5 /dev/sdb5
Feb 26 10:20:38 ops root: logger: yes | mdadm --create /dev/md6 --auto yes --level raid1 --raid-devices 2 /dev/sda6 /dev/sdb6
Feb 26 10:20:38 ops root: logger: yes | mdadm --create /dev/md7 --auto yes --level raid1 --raid-devices 2 /dev/sda7 /dev/sdb7
Feb 26 10:20:38 ops root: logger: Load device mapper driver (for LVM).
Feb 26 10:20:38 ops root: logger: Load additional filesystem drivers.
Feb 26 10:20:38 ops root: logger: mke2fs -q -j /dev/md7 || shellout

Feb 26 10:21:26 ops root: logger: tune2fs -L / /dev/md7
Feb 26 10:21:27 ops root: logger: mkdir -p /a/ || shellout
Feb 26 10:21:27 ops root: logger: mount /dev/md7 /a/ -t ext3 -o defaults || shellout
Feb 26 10:21:27 ops root: logger: mke2fs -q -j /dev/md1 || shellout
Feb 26 10:21:29 ops root: logger: tune2fs -L /boot /dev/md1
Feb 26 10:21:29 ops root: logger: mkdir -p /a/boot || shellout
Feb 26 10:21:29 ops root: logger: mount /dev/md1 /a/boot -t ext3 -o defaults || shellout
Feb 26 10:21:29 ops root: logger: mke2fs -q -j /dev/md6 || shellout
Feb 26 10:21:34 ops root: logger: tune2fs -L /logs /dev/md6
Feb 26 10:21:34 ops root: logger: mkdir -p /a/logs || shellout
Feb 26 10:21:34 ops root: logger: mount /dev/md6 /a/logs -t ext3 -o defaults || shellout
Feb 26 10:21:34 ops root: logger: mke2fs -q -j /dev/md3 || shellout
Feb 26 10:21:37 ops root: logger: tune2fs -L /tmp /dev/md3
Feb 26 10:21:37 ops root: logger: mkdir -p /a/tmp || shellout
Feb 26 10:21:37 ops root: logger: mount /dev/md3 /a/tmp -t ext3 -o defaults || shellout
Feb 26 10:21:37 ops root: logger: mke2fs -q -j /dev/md5 || shellout
Feb 26 10:21:42 ops root: logger: tune2fs -L /var /dev/md5
Feb 26 10:21:42 ops root: logger: mkdir -p /a/var || shellout
Feb 26 10:21:42 ops root: logger: mount /dev/md5 /a/var -t ext3 -o defaults || shellout
Feb 26 10:21:42 ops root: logger: mkswap -v1 /dev/sda2 || shellout
Feb 26 10:21:42 ops root: logger: swapon /dev/sda2 || shellout
Feb 26 10:21:42 ops root: logger: mkdir -p /a/proc || shellout
Feb 26 10:21:42 ops root: logger: mount proc /a/proc -t proc -o defaults || shellout
Feb 26 10:21:42 ops root: logger: mkdir -p /a/sys || shellout
Feb 26 10:21:42 ops root: logger: mount sysfs /a/sys -t sysfs -o defaults || shellout
Feb 26 10:21:42 ops root: logger: Evaluating image size.
Feb 26 10:22:45 ops root: logger: Report task started.
Feb 26 10:22:45 ops root: logger: Quietly installing image...

As you may be able to see, during this process there are no complaints and the system is created correctly. It's on boot that one of the raid partitions appears to be in use. Why? Is it that it's not loading the MD devices early enough? But then why does the system show the md's as mounted?

Something in the boot process appears to be locking /dev/sdb7..
ibm_acpi: ec object not found
md: Autodetecting RAID arrays.
md: could not bd_claim sdb7. <--- something has it open!
md: autorun ...
md: considering sdb6 ...
md: adding sdb6 ...
md: sdb5 has different UUID to sdb6
md: sdb3 has different UUID to sdb6

Again super odd and I'm open to almost anything!

Thanks
 
Old 02-26-2008, 01:36 PM   #6
Randall Slack
on boot?

did you do
update-initramfs -u
 
Old 02-26-2008, 01:38 PM   #7
SanRamonCA
ummm hmmm, no.... Looking into that now
 
Old 02-26-2008, 01:40 PM   #8
Randall Slack
or probably not -u since that would mean your running kernel
 
Old 02-26-2008, 01:41 PM   #9
SanRamonCA
Okay, that is an Ubuntu command. But I can possibly create a new initrd with mkinitrd to solve this..

hmmm, it's worth a try!
 
Old 02-26-2008, 01:58 PM   #10
Randall Slack
what do you have in your grub anyway?
 
Old 02-26-2008, 02:30 PM   #11
SanRamonCA
I tend to eat most everything..

Oh, you mean my grub.conf

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/sda8
# initrd /initrd-version.img
#boot=/dev/md0
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora Core (2.6.18-1.2798.fc6PAE)
root (hd0,0)
kernel /vmlinuz-2.6.18-1.2798.fc6PAE ro root=LABEL=/ console=ttyS0,9600n8
initrd /initrd-2.6.18-1.2798.fc6PAE.img
 
Old 02-26-2008, 02:37 PM   #12
Randall Slack
not 100% sure because of the difference between fc and debian

but this is what I have
kernel /vmlinuz-2.6.18-5-amd64 root=/dev/md1 ro


mind the root=/dev/md1
 
Old 02-26-2008, 02:57 PM   #13
SanRamonCA
ya, but I'm using labels , which is basically the same as /dev/md1 but if you e2label my /dev/md1 you get /boot and /dev/md7 you see /

I looked in my initrd.img and don't see anything exciting. I do see a mention of sda6 in the init file, but that doesn't ring any bells since my root partitions are on /dev/sd[a-b]

I'm going to take another shot at seeing if something is coming up at boot and causing /dev/sdb7 to be busy "during boot"

I know when I went into rescue mode, I could add /dev/sdb7 and my mirrors were clean, however once I reboot, /dev/sdb7 shows that it's busy again.

So I think you were headed in the right direction (2 beers ago) when you asked "on boot?" and whether I had updated my initrd (initramfs).

So I'm still tracking

Thanks
 
Old 02-26-2008, 03:01 PM   #14
Randall Slack
don't take my word for it, I'm just guessing. but I bet it will be something so simple I will hit my head with one of these http://www.hertogjan.nl/

may I ask why you are using "systemimager" anyway?
 
Old 02-26-2008, 04:10 PM   #15
SanRamonCA
Okay... I owe you a bier..

First the reason I use systemimager, is that I have different hardware that I quickly image using systemimager and it places an almost identical configuration on each server. It's nothing more than powering up a system and having it netboot and it will completely build itself. I even use it for XEN deployment (just finishing up that project)

Okay the good news, thanks to some of your "what about's and have you tried this or that", i appear to be golden

Check it "try to hold back the tears".
Personalities : [raid1]
md1 : active raid1 sdb1[1] sda1[0]
250880 blocks [2/2] [UU]

md3 : active raid1 sdb3[1] sda3[0]
2007744 blocks [2/2] [UU]

md5 : active raid1 sdb5[1] sda5[0]
4015552 blocks [2/2] [UU]

md6 : active raid1 sdb6[1] sda6[0]
4015552 blocks [2/2] [UU]

md7 : active raid1 sdb7[1] sda7[0]
63844992 blocks [2/2] [UU]

unused devices: <none>

So the issue was definitely the fact that I was trying to use a non-raid initrd.img for my new install (remember, it's the same image as my systems with no software raid). Obviously, trying to use the same initrd.img caused some issues, as it was not loading the raid groups early enough.

So rebuilding my initrd from my host allowed some raid steps to be added and the root filesystem to look for md7 instead of sda7.

So I've rebooted it and things look lovely. this only took me on and off a few months but it's awesome that we have figured it out.

So anyone that sees "md: could not bd_claim sd*" should look in their initrd.img to make sure it's not trying to mount something as the root filesystem that is unwanted.
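
For anyone wanting to script that check, here is a rough sketch. It assumes the initrd is a gzip-compressed cpio archive whose init script contains "raidautorun /dev/mdN" lines when it was built with RAID support, which is my understanding of what a RAID-aware Fedora mkinitrd writes. The helper name init_starts_raid is made up, and the member name "init" inside the archive may differ on other setups.

```shell
#!/bin/sh
# init_starts_raid (hypothetical helper): succeed if an initrd init script,
# read from stdin, contains a line that starts an md array.
# Assumption: a RAID-aware Fedora mkinitrd writes "raidautorun /dev/mdN" lines.
init_starts_raid() {
    grep -Eq '^[[:space:]]*raidautorun[[:space:]]+/dev/md[0-9]+'
}

# Example use (not executed here), assuming a gzip-compressed cpio initrd
# and GNU cpio; the image path is the one from the grub.conf in this thread:
#   zcat /boot/initrd-2.6.18-1.2798.fc6PAE.img | cpio -i --quiet --to-stdout init \
#       | init_starts_raid || echo "initrd does not start any md arrays"
```

If the check fails, rebuilding the initrd on a host that already has the arrays configured, as described above, would be the next thing to try.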

Thank you sir, it's most appreciated!
 
  


All times are GMT -5.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration