LinuxQuestions.org > Linux - Software
Problem creating new mdadm raid 1 array
(https://www.linuxquestions.org/questions/linux-software-2/problem-creating-new-mdadm-raid-1-array-866088/)

jlowry 03-02-2011 05:34 PM

Problem creating new mdadm raid 1 array
 
hello all;

Background:

I have a server that was running a hardware isw RAID on the system (root) disk. This was working just fine until I started getting sector errors on one of the disks. So I shut down the system, removed the failing drive, and installed a new drive of the same size. On reboot I went into the Intel RAID setup; it showed the new drive and I was able to set it to rebuild the RAID. Continuing the boot, everything came up just fine except the RAID 1 on the system disk. I have tried many times to get the system to rebuild the RAID using dmraid, but to no avail; it would not start a rebuild. In order to get the system back up and make sure the disk was duplicated, I 'dd'ed the working disk onto the new disk.
At present the system does not show a RAID setup on the system disk (this comprises the entire 1TB disk, with two partitions: sda1 as / and sda2 as swap).
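
For reference, the clone was just a plain disk-to-disk dd, something along these lines; the device names here are illustrative and should be confirmed with fdisk -l first, and ideally the copy is done from a rescue environment so the source filesystem is not changing underneath:

Code:

# assumed: sda is the surviving good disk, sdb is the new replacement
dd if=/dev/sda of=/dev/sdb bs=1M
# have the kernel re-read the copied partition table on the new disk
blockdev --rereadpt /dev/sdb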

Problem:
I have decided to forgo the Intel RAID and just use mdadm. I have a test system set up to duplicate the server setup (not the software, but the disk partitions).

Code:

[root@kilchis etc]# fdisk -l

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

  Device Boot      Start        End      Blocks  Id  System
/dev/sda1  *          1      121079  972567036  83  Linux
/dev/sda2          121080      121601    4192965  82  Linux swap / Solaris

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

  Device Boot      Start        End      Blocks  Id  System
/dev/sdb1  *          1      121079  972567036  83  Linux
/dev/sdb2          121080      121601    4192965  82  Linux swap / Solaris

[root@kilchis sysconfig]# cat /proc/mdstat
Personalities :
unused devices: <none>

[root@kilchis sysconfig]# df
Filesystem          1K-blocks      Used Available Use% Mounted on
/dev/sda1            942106768  3284652 890193768  1% /
tmpfs                  1029620        0  1029620  0% /dev/shm

[root@kilchis sysconfig]# swapon -s
Filename                                Type            Size    Used    Priority
/dev/sda2                              partition      4192956 0      -1

When I try to create a RAID 1 setup using sda1 and sdb1, it fails with "Device or resource busy":

Code:

[root@kilchis sysconfig]# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm: Cannot open /dev/sda1: Device or resource busy
mdadm: create aborted

I get the same error if I create a degraded RAID 1 using just sdb1 and then try to add the already-running disk:

Code:

[root@kilchis sysconfig]# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 missing
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Wed Mar  2 03:11:11 2011
Continue creating array? mdadm --assemble /dev/md0
Continue creating array? (y/n) y
mdadm: array /dev/md1 started.
[root@kilchis sysconfig]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb1[0]
      972566912 blocks [2/1] [U_]
     
unused devices: <none>
[root@kilchis sysconfig]# mdadm --manage  /dev/md1 --add /dev/sda1
mdadm: Cannot open /dev/sda1: Device or resource busy

The big question is: Why will mdadm not let me create a raid using the running system disk? And how can I get around this?

Any help is much appreciated!!
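
The cause of the "busy" error: mdadm needs exclusive access to a partition before it will make it an array member, and /dev/sda1 is the mounted root filesystem of the running system (with /dev/sda2 active as swap). A quick check along these lines confirms what is holding the devices (commands only, output omitted):

Code:

mount | grep sda        # /dev/sda1 is mounted on /
swapon -s               # /dev/sda2 is in use as swap
cat /proc/mdstat        # no md array is holding it yet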

macemoneta 03-02-2011 05:47 PM

Instructions here. It was the first hit when I Google'd.

jlowry 03-02-2011 06:05 PM

Been down that road... it fails
 
Quote:

Originally Posted by macemoneta (Post 4277014)
Instructions here. It was the first hit when I Google'd.

If you notice, the mdadm create did not complete. I have run the create a number of times on my test system, and it always comes back with 'Device or resource busy'.

Good link though; I will keep it in mind. I have not gotten to the point where I need to set up grub.

thanks

macemoneta 03-02-2011 06:17 PM

I don't think you understand the sequence of actions required, as described at the link I gave you.

You have to:

- Create the array on the single unused drive
- migrate the data from the existing system drive to the single-drive array
- Configure the array to boot
- Reboot onto the array
- Add the original system drive to the array
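
In command form, that sequence looks roughly like the following. This is only a sketch based on the partition layout shown earlier in the thread; the device names, filesystem type and the initrd/grub details are assumptions, and the linked instructions cover the specifics:

Code:

# 1. create a degraded RAID 1 on the unused drive, leaving the second slot open
mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
mkfs.ext3 /dev/md0

# 2. migrate the data from the running system drive onto the single-drive array
mount /dev/md0 /mnt
cp -dpRx / /mnt

# 3. configure the array to boot: point /mnt/etc/fstab and the grub entry at
#    /dev/md0, rebuild the initrd, and install grub on the new drive
# 4. reboot onto the array (boot the new drive / new grub entry)
# 5. add the original system drive to the now-running array
mdadm --manage /dev/md0 --add /dev/sda1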

jlinkels 03-02-2011 06:21 PM

You can't create a RAID array simply out of two disks that happen to contain the same data. In addition, I don't know what happened when you tried to put the disk back into the Intel RAID. It is either hardware RAID, BIOS (fake) RAID, or mdadm RAID.

BIOS RAID usually doesn't work with Linux (but you'll only discover that when you lose a disk or try to do something with the array).

If you want to switch to mdadm RAID, take the system down to a single disk and recreate the array. The sequence is like this:
1. Get a working system on one disk (the primary).
2. Clean the secondary disk and partition it.
3. Create a degraded array on the secondary disk.
4. Copy the working system from the primary to the secondary.
5. Clean the primary and add it to the degraded array to make the array complete (commands for this step are sketched below).
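
For the final step, the commands would look roughly like this (a sketch; the device names follow the layout shown earlier in the thread, and the old primary must be completely unused -- not mounted, no active swap -- before it is added):

Code:

# assumed names: sdb1 is the existing array member, sda1 is the old primary
# change sda1's partition type to fd (Linux raid autodetect) first, then:
mdadm --manage /dev/md0 --add /dev/sda1

# watch the rebuild until both members show up as [UU]
cat /proc/mdstat
mdadm --detail /dev/md0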

An excellent guide: http://www200.pair.com/mecham/raid/r...aded-etch.html It is not Debian-specific.

jlinkels

jlowry 03-02-2011 06:26 PM

Then I have questions????
 
Quote:

Originally Posted by macemoneta (Post 4277033)
I don't think you understand the sequence of actions required, as described at the link I gave you.

You have to:

- Create the array on the single unused drive
- migrate the data from the existing system drive to the single-drive array
- Configure the array to boot
- Reboot onto the array
- Add the original system drive to the array

Why do you have to migrate the data manually, when building the RAID should migrate the data automatically? At least that is the way RAID worked in the past. Or is this just how software RAID works?

If that is how it works, then you are right: I did not understand the sequence it requires. As a side note, I would have expected that to be covered in the man pages or the Linux documentation (at least I didn't read it there).

thanks

jlinkels 03-02-2011 06:38 PM

You have to copy the data from the non-RAIDed partition to the RAIDed partition because you don't have RAID yet. After that, once you have completed the array, the data will sync automatically, yes.

It is in the Linux documentation; macemoneta and I gave you the links.

jlinkels

jlowry 03-03-2011 04:39 PM

Still having problems with booting
 
Okay, so I followed the procedure in the link. I only have one partition, so it is easy to work with. When I reboot I get the following error:

Kernel Panic - not syncing: VFS: Unable to mount root fs on unknown-block(9,0)

I added the grub entries just as they are in the link, and then copied them to the working drive.

Anyone have any ideas on this?

menu.lst - boot lines
Code:

(original boot line pointed to root=LABEL=/1)
root (hd0,0)
kernel /boot/vmlinuz-2.6.18-164.el5 root=/dev/md0 md=0,/dev/sda1,/dev/sdb1 ro quiet
initrd /boot/initrd-2.6.18-164.el5.img

(new boot lines for raid)
root (hd0,0)
kernel /boot/vmlinuz-2.6.18-164.el5 root=/dev/md0 md=0,/dev/sda1,/dev/sdb1 ro quiet
boot
(mirror recovery)
root (hd1,0)
kernel /boot/vmlinuz-2.6.18-164.el5 root=/dev/md0 md=0,/dev/sdb1 ro quiet
boot

thanks
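
For comparison: on CentOS 5 a grub stanza for an md root normally still loads an initrd, since raid1 is usually built as a module on that kernel, and the new entries above drop the initrd line; that is one plausible cause of the unknown-block(9,0) panic. A sketch using the kernel version from this thread (the exact mkinitrd invocation is an assumption):

Code:

title CentOS RAID (2.6.18-164.el5)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/md0 md=0,/dev/sda1,/dev/sdb1 quiet
        initrd /boot/initrd-2.6.18-164.el5.img

# after the fstab/mdadm.conf changes, the initrd can be regenerated with:
#   mkinitrd -f /boot/initrd-2.6.18-164.el5.img 2.6.18-164.el5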

macemoneta 03-03-2011 05:20 PM

Just to be clear, you set your BIOS to boot the new RAID drive, and the grub configuration is the same on both drives? Which grub entry did you boot?

jlowry 03-04-2011 11:38 AM

Raid boot problem
 
Well, after checking the BIOS, it is set up to boot the first disk.

The grub configuration is set up to boot the first (1) option in grub, which points to both disks in the RAID.

I have tried to boot each of the entries, and they all fail when the BIOS is pointing to the first disk.

-------
(next day)

Changed the BIOS so that it is looking at the second disk as the boot disk.
Selected the grub entry (1) for both disks in the raid. It gets further, but still fails looking for /dev/hda.

Setting up Logical Volume Management: /dev/hda: open failed: no medium found
Checking Filesystems
fsck.ext3: Invalid argument while trying to open /dev/md0
kernel direct mapping tables up to 100000000 @ 10000-15000 [Failed]

-----

I get the same error when booting the original grub entry pointing to the first disk.

-----

When I try to boot entry (2), pointing to /dev/sdb1 of the RAID, it fails with a kernel crash.

-----

macemoneta 03-04-2011 11:57 AM

You want the BIOS to boot the RAID drive. Since the RAID array is operating degraded, you want to use the second grub menu entry. Details on the kernel issue would help.

You mentioned earlier that you only had a single partition on the drive. That means that /boot is just a regular directory. Are you sure that your BIOS doesn't have a limitation on addressing? If you suspect that it does, you need a separate /boot partition at the beginning of the drive.
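
If the BIOS does turn out to have an addressing limit, the usual workaround is a small /boot partition (mirrored as well) at the very start of each disk. A sketch of the kind of layout meant here; the sizes and md numbering are illustrative, not from this thread:

Code:

# illustrative layout, repeated identically on both disks:
#   sda1/sdb1  ~200 MB  type fd  ->  /dev/md0  mounted as /boot
#   sda2/sdb2  bulk     type fd  ->  /dev/md1  mounted as /
#   sda3/sdb3  ~4 GB    type 82  ->  swap
# the partition table can then be copied from one disk to the other:
sfdisk -d /dev/sda | sfdisk /dev/sdb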

jlowry 03-04-2011 12:25 PM

more info
 
Quote:

Originally Posted by macemoneta (Post 4279103)
You want the BIOS to boot the RAID drive. Since the RAID array is operating degraded, you want to use the second grub menu entry. Details on the kernel issue would help.

You mentioned earlier that you only had a single partition on the drive. That means that /boot is just a regular directory. Are you sure that your BIOS doesn't have a limitation on addressing? If you suspect that it does, you need a separate /boot partition at the beginning of the drive.

The BIOS is set up to boot the second disk, which is the RAID and is pointed to by the second grub menu entry. Second meaning the second entry, but from grub's point of view entry one (1), since they start counting at zero (0). Right?

Yes, I have a single partition that includes /boot as a directory. Limitations on addressing? Not sure; how would you find this info? Hopefully this will not be the case, as I am trying to duplicate my working file server.

Since this is a test system, I am going to scrub it and start over. I will add to the thread when I get back to this point, or when I succeed.

jlowry 03-04-2011 05:13 PM

rebuilt system and reconfigured raid - still no joy
 
I have rebuilt the system and did a copy/paste after each of the commands, saving them on another system. When I reboot the system (BIOS using the second drive) with grub selecting the first entry for mirroring, the system fails to boot:

Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(9,0)

But booting from the original grub entry, which points to root=LABEL=/1, works, and it then shows that I am using /dev/md0.

When I try to add /dev/sda1 (which is the original drive) I get:

mdadm --add /dev/md0 /dev/sda1
mdadm: Cannot open /dev/sda1: Device or resource busy

swapon -s shows that both /dev/sda2 and /dev/sdb2 are being used correctly

Not sure where to go from here.
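
Before anything else, it is worth confirming what actually got mounted and what is still holding /dev/sda1; if the label-based grub entry was the one that booted, the kernel may well have mounted /dev/sda1 as root even though the userland tools report /dev/md0. A sketch of the checks (commands only):

Code:

cat /proc/cmdline                   # which root= the kernel was actually given
grep -E 'md0|sda1' /proc/mounts     # what is really mounted where
mdadm --detail /dev/md0             # state and members of the array
swapon -s                           # which swap partitions are active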

Here is the copy/paste of what I have done:
Code:

[root@kilchis /]# sfdisk -l /dev/sda

Disk /dev/sda: 121601 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

  Device Boot Start    End  #cyls    #blocks  Id  System
/dev/sda1  *      0+ 121078  121079- 972567036  83  Linux
/dev/sda2    121079  121600    522    4192965  82  Linux swap / Solaris
/dev/sda3          0      -      0          0    0  Empty
/dev/sda4          0      -      0          0    0  Empty

[root@kilchis /]# ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sdb

[root@kilchis /]# df
Filesystem          1K-blocks      Used Available Use% Mounted on
/dev/sda1            942106768  3284600 890193820  1% /
tmpfs                  1029620        0  1029620  0% /dev/shm

[root@kilchis /]# more /boot/grub/menu.lst
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You do not have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /, eg.
#          root (hd0,0)
#          kernel /boot/vmlinuz-version ro root=/dev/sda1
#          initrd /boot/initrd-version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-164.el5)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=/1 rhgb quiet
        initrd /boot/initrd-2.6.18-164.el5.img

[root@kilchis /]# more /etc/fstab
LABEL=/1                /                      ext3    defaults        1 1
tmpfs                  /dev/shm                tmpfs  defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                  /sys                    sysfs  defaults        0 0
proc                    /proc                  proc    defaults        0 0
LABEL=SWAP-sda2        swap                    swap    defaults        0 0

[root@kilchis /]# fdisk /dev/sda

The number of cylinders for this disk is set to 121601.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
  (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

  Device Boot      Start        End      Blocks  Id  System
/dev/sda1  *          1      121079  972567036  83  Linux
/dev/sda2          121080      121601    4192965  82  Linux swap / Solaris
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)

Command (m for help): p

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

  Device Boot      Start        End      Blocks  Id  System
/dev/sda1  *          1      121079  972567036  fd  Linux raid autodetect
/dev/sda2          121080      121601    4192965  82  Linux swap / Solaris

[root@kilchis /]# sfdisk -d /dev/sda | sfdisk /dev/sdb
Checking that no-one is using this disk right now ...
OK

Disk /dev/sdb: 121601 cylinders, 255 heads, 63 sectors/track
Old situation:
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

  Device Boot Start    End  #cyls    #blocks  Id  System
/dev/sdb1          0      -      0          0    0  Empty
/dev/sdb2          0      -      0          0    0  Empty
/dev/sdb3          0      -      0          0    0  Empty
/dev/sdb4          0      -      0          0    0  Empty
New situation:
Units = sectors of 512 bytes, counting from 0

  Device Boot    Start      End  #sectors  Id  System
/dev/sdb1  *        63 1945134134 1945134072  fd  Linux raid autodetect
/dev/sdb2    1945134135 1953520064    8385930  82  Linux swap / Solaris
/dev/sdb3            0        -          0  0  Empty
/dev/sdb4            0        -          0  0  Empty
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)

[root@kilchis etc]# mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
mdadm: /dev/sdb1 appears to contain an ext2fs file system
    size=972566912K  mtime=Thu Mar  3 14:06:14 2011
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Mar  3 11:02:55 2011
Continue creating array? y
mdadm: array /dev/md0 started.

[root@kilchis etc]# mkfs.ext3 /dev/md0
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
121585664 inodes, 243141728 blocks
12157086 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
7421 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848

Writing inode tables: done                           
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

[root@kilchis etc]# mount /dev/md0 /mnt
[root@kilchis etc]# cp -dpRx / /mnt
cp: preserving permissions for `/mnt/var/run/cups/certs/0': Operation not supported
cp: preserving ACL for `/mnt/var/run/cups/certs/0': Operation not supported


[root@kilchis etc]# mkswap -v1 /dev/sdb2
Setting up swapspace version 1, size = 4293591 kB
[root@kilchis etc]#

[root@kilchis etc]# pwd
/mnt/etc
[root@kilchis etc]# more fstab
/dev/md0                /                       ext3    defaults,errors=remount-ro  0 1
#LABEL=/1                /                      ext3    defaults        1 1
tmpfs                  /dev/shm                tmpfs  defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                  /sys                    sysfs  defaults        0 0
proc                    /proc                  proc    defaults        0 0
/dev/sda2        none                    swap    sw,pri=1        0 0
/dev/sdb2        none                    swap    sw,pri=1        0 0


# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You do not have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /, eg.
#          root (hd0,0)
#          kernel /boot/vmlinuz-version ro root=/dev/sda1
#          initrd /boot/initrd-version.img
#boot=/dev/sda
default=1
timeout=5
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-164.el5)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=/1 rhgb quiet
        initrd /boot/initrd-2.6.18-164.el5.img
title CentOS Mirror(2.6.18-164.el5)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/md0 md=0,/dev/sda1,/dev/sdb1 rhgb quiet
#      initrd /boot/initrd-2.6.18-164.el5.img
        boot
title CentOS Recovery(2.6.18-164.el5)
        root (hd1,0)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=0,/dev/sdb1 rhgb quiet
#      initrd /boot/initrd-2.6.18-164.el5.img
        boot


[root@kilchis grub]# grub-install /dev/sda
Installation finished. No error reported.
This is the contents of the device map /boot/grub/device.map.
Check if this is correct or not. If any of the lines is incorrect,
fix it and re-run the script `grub-install'.

# this device map was generated by anaconda
(hd0)    /dev/sda

[root@kilchis grub]# grub
Probing devices to guess BIOS drives. This may take a long time.


    GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
  lists possible command completions.  Anywhere else TAB lists the possible
  completions of a device/filename.]
grub> device (hd0) /dev/sdb
device (hd0) /dev/sdb
grub> root (hd0,0)
root (hd0,0)
 Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
setup (hd0)
 Checking if "/boot/grub/stage1" exists... yes
 Checking if "/boot/grub/stage2" exists... yes
 Checking if "/boot/grub/e2fs_stage1_5" exists... yes
 Running "embed /boot/grub/e2fs_stage1_5 (hd0)"...  15 sectors are embedded.
succeeded
 Running "install /boot/grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/boot/grub/stage2 /boot/grub/grub.conf"... succeeded
Done.
grub> quit


[root@kilchis grub]# pwd
/mnt/boot/grub
[root@kilchis grub]# cp -dp /mnt/etc/fstab /etc/fstab
[root@kilchis grub]# cp -dp /mnt/boot/grub/menu.lst /boot/grub


****** time to reboot *******


