LinuxQuestions.org


zcruzm 01-16-2010 04:29 PM

Problem configuring RAID 5 under Ubuntu Server 9.10
 
Hi all!

I have a problem configuring a RAID server under Ubuntu 9.10 (kernel 2.6.31.17) with mdadm (v2.6.7.1). First I had some hardware issues that finally got solved by using another motherboard. Now I am dealing with the software part.

In order to ease things, I am trying to configure a RAID 5 with three partitions in one disk. I have two HD's, one IDE where the OS lies (recognized as sda), and another where I intend to build the RAID (recognized as sdb). In this second drive I have made three partitions (sdb1, sdb2 & sdb3) of the same size. For this I've used

sudo fdisk /dev/sdb

to create them, then changed each partition's type to "fd". Then I formatted each one with

sudo mkfs.ext4 -m 0 /dev/sdb1
sudo mkfs.ext4 -m 0 /dev/sdb2
sudo mkfs.ext4 -m 0 /dev/sdb3

After that I've created the array with

sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdb2 /dev/sdb3

Finally formatted it with

sudo mkfs.ext4 -m 0 /dev/md0

Everything seemed fine. All messages indicated it was OK and I was able to mount it and put some files there.

The problem came after rebooting: the array was not there anymore. Issuing

cat /proc/mdstat

gave me

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdb[0](S)
1465138496 blocks

unused devices: <none>
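A minimal sketch for checking this state from a script. The helper name `list_inactive_md` is made up for illustration; it prints each array that mdstat-style text reports as inactive, together with the members listed for it. Against the output above it reports `sdb[0](S)`, that is, the whole disk `sdb` rather than `sdb1`, `sdb2` and `sdb3`. That may be worth noting, although nothing in this thread establishes why.

```shell
# list_inactive_md [FILE]
# Hypothetical helper: prints "<array> <member>..." for every array that
# mdstat-style input reports as inactive. Reads /proc/mdstat by default.
list_inactive_md() {
    awk '$2 == ":" && $3 == "inactive" {
        line = $1
        for (i = 4; i <= NF; i++) line = line " " $i
        print line
    }' "${1:-/proc/mdstat}"
}
```

On the affected machine, calling `list_inactive_md` with no argument reads the live /proc/mdstat.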

Tried

sudo mdadm -A /dev/md0 --run /dev/sdb1 /dev/sdb2 /dev/sdb3

and it was claiming

[1992.78]md: could not bd_claim sdb1.
mdadm: failed to add /dev/sdb1 to /dev/md0: Device or resource busy
[1992.78]raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
mdadm: /dev/md0 has been started with 2 drives (out of 3)

After stopping and rebooting, the array seemed to work, and it would mount properly. But stopping and restarting it gave all kinds of weird messages, as follows.

[ 384.280302] md: could not bd_claim sdb2.
[ 384.280370] md: md_import_device returned -16
[ 384.288420] md: bind<sdb3>
[ 384.288583] md: could not bd_claim sdb1.
[ 384.288648] md: md_import_device returned -16
[ 384.303607] raid5: device sdb3 operational as raid disk 2
[ 384.304906] raid5: allocated 3178kB for md0
[ 384.305063] raid5: not enough operational devices for md0 (2/3 failed)
[ 384.305177] RAID5 conf printout:
[ 384.305185] --- rd:3 wd:1
[ 384.305194] disk 2, o:1, dev:sdb3
[ 384.305876] raid5: failed to run raid set md0
[ 384.305935] md: pers->run() failed ...
[ 384.325074] md: bind<sdb1>
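The return code -16 in these messages is -EBUSY, the same "Device or resource busy" that mdadm printed earlier. As a rough sketch, the members the kernel could not claim can be pulled out of the log text like this; the helper name `busy_md_members` is made up for illustration.

```shell
# busy_md_members [FILE]
# Hypothetical helper: lists the devices the kernel reported as
# "could not bd_claim", one per line, without duplicates.
# Reads kernel log text from FILE, or from standard input if FILE is omitted.
busy_md_members() {
    sed -n 's/.*md: could not bd_claim \([^.]*\)\..*/\1/p' "$@" | sort -u
}
```

Typical use would be `dmesg | busy_md_members`. For the log above it lists sdb1 and sdb2.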

The biggest problem is repeatability: I get different errors from the same commands. Sometimes, if I keep stopping and restarting the array, it starts OK with all three partitions, and sometimes it claims that one of them is in use. Going through the logs, I've found that sometimes the partition is held by "/dev/md_d0" (ls /sys/block/sdb/sdb1/holders). I don't know what that device is, how it got there, or how to prevent it from appearing.
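The holders check mentioned here can be wrapped in a small function so it is easy to repeat for each partition. This is only a sketch: the name `md_holders` and its optional third argument are made up for illustration.

```shell
# md_holders DISK PART [SYSFS_ROOT]
# Hypothetical helper: prints the names of the block devices that currently
# hold PART (for example md0 or md_d0), one per line. Prints nothing if there
# are none and returns 1 if the holders directory does not exist.
# SYSFS_ROOT defaults to /sys; it exists only so the function can be tried
# against a test directory.
md_holders() {
    dir="${3:-/sys}/block/$1/$2/holders"
    [ -d "$dir" ] || return 1
    ls "$dir"
}
```

For example, `md_holders sdb sdb1` lists whatever holds sdb1. If an unexpected array such as md_d0 shows up, it can be stopped with `sudo mdadm --stop /dev/md_d0` before assembling md0 again. A commonly reported cause of such a device is automatic assembly at boot without a matching ARRAY line in mdadm.conf, but that is an assumption; this thread does not confirm it.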

Actually I intend to do a RAID 5 with 5 1.5TB disks, but I don't want to make tests on the whole setup since it's very time consuming (about 36 hours to build the array) and it seems that there is a software issue that I cannot get hold of.

Any help would be appreciated. I've already re-installed Ubuntu 9.10 a couple of times, zeroed the superblocks of the partitions, repartitioned the disks with different partition sizes (I am using 5 GB partitions to save time). I've gone through this process several times, and I really don't know how to move forward now. If RAID is about trust and reliability, this is exactly what I'm not able to get.

Regards,

Alberto

jlinkels 01-16-2010 06:45 PM

What you are trying to do, creating a RAID array on three partitions of the same device, is highly unusual. I don't think mdadm forbids it, but I am not surprised that it chokes either. Why do you want to create a RAID 5 array on a single device? I hope it is not to guard against bit errors; those are corrected internally by the hard disk. And if one partition fails, I wouldn't be surprised if you couldn't read from the other partitions either, because the disk is stuck trying to read from the failed one. Then again, 9 out of 10 failed disks I see have failed in their controller, not through bit errors.

Anyway, you should not format the partitions before you create the RAID arrays.

Try to clean the superblocks before you create the arrays. There is an mdadm command for that, but I forget which one.
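The order suggested above can be sketched as a dry run. The function below only prints commands and runs nothing, and its name `plan_raid5` is made up for illustration. It assumes that the superblock cleanup meant here is mdadm's `--zero-superblock` option; the create and mkfs commands are the ones from the first post.

```shell
# plan_raid5 MD_DEVICE MEMBER...
# Hypothetical dry-run helper: only PRINTS the commands, it executes nothing.
plan_raid5() {
    md=$1
    shift
    n=$#
    if [ "$n" -lt 3 ]; then
        echo "RAID 5 needs at least 3 member devices" >&2
        return 1
    fi
    # 1. Clear stale md superblocks on each member (no mkfs on the members).
    for dev in "$@"; do
        echo "mdadm --zero-superblock $dev"
    done
    # 2. Create the array from the unformatted members.
    echo "mdadm --create $md --level=5 --raid-devices=$n $*"
    # 3. Put the filesystem on the array only.
    echo "mkfs.ext4 -m 0 $md"
}
```

Calling `plan_raid5 /dev/md0 /dev/sdb1 /dev/sdb2 /dev/sdb3` prints the five commands in order, so they can be reviewed before being run by hand with sudo.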

Re-installing doesn't solve a thing. It is not Windows. If you doubt your installation, install Debian Stable. Since you are using the CLI anyway this shouldn't make a difference and Debian Stable is stable.

jlinkels

zcruzm 01-17-2010 08:05 AM

I am trying the RAID on the same device just for the sake of getting used to the configuration and management. Once I realize it's stable and works seamlessly, I'll make the array with 5 disks.

jlinkels 01-17-2010 08:46 AM

If you want to know if something is running stable, you should use a common setup, not something that is exceptional to the point where it is questionable whether it conforms to specification.

Besides, RAID on Linux servers has been proven to the extreme, so there is no need to hold off on your eventual installation to see if it is stable.

Ubuntu Desktop does not have the best record for stability. I have been told this is different with Ubuntu Server, but nevertheless you should check the relevant forums to see if Ubuntu Server is stable with RAID. If there are doubts, choose another distro.

jlinkels

zcruzm 01-17-2010 01:14 PM

Quote:

Originally Posted by jlinkels (Post 3829678)
If you want to know if something is running stable, you should use a common setup, not something that is exceptional to the point where it is questionable whether it conforms to specification.

I would agree. But since mdadm works at a partition level there shouldn't be any problem, I guess. I do get a warning that "the partitions are on the same physical device, and thus a disk failure could mean the loss of data". No other warning, hence, I must assume that mdadm recognizes the situation. As said before, this is just to get used to the configuration.

Quote:

Besides, RAID on Linux servers has been proven to the extreme so there is no need to wait with your eventual installation to see if it is stable.

Ubuntu Desktop has not the best record for stability. I have been told this is different with Ubuntu server, but nevertheless you should check relevant forums to see if Ubuntu server is stable on RAID.

I am not doubting its stability as a software product, but if the array is no longer recognized after rebooting the machine and I get spurious devices like md_d0, I have to be very careful about what I am doing, since I am risking all my data. There must be some configuration issue that I cannot detect, so I am asking for help in case anyone has gone through a similar situation.

I can install the whole setup (5x1.5TB), but then creating the array takes 36 hours, which I think is just not practical at this stage.

Quote:

If there are doubts choose another distro.

jlinkels

Can you suggest any that you've had good results with?

Thanks,

Alberto

jlinkels 01-17-2010 07:23 PM

Quote:

Originally Posted by zcruzm (Post 3829927)
I can install the whole setup (5x1.5TB), but then creating the array takes 36 hours, which I think is just not practical at this stage.

You don't have to use the entire disks, you can create some smaller partitions.
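As a rough worked example of why smaller partitions help: if the initial build time is assumed to scale linearly with member size (an assumption, fixed overheads are ignored), the 36 hours quoted for 1.5 TB members can be scaled down to the 5 GB test partitions mentioned in the first post. The function name is made up for illustration, and 1.5 TB is taken as 1500 GB.

```shell
# estimate_resync_seconds REF_SECONDS REF_SIZE_GB TEST_SIZE_GB
# Hypothetical helper: scales a known build time linearly with member size,
# using integer arithmetic.
estimate_resync_seconds() {
    echo $(( $1 * $3 / $2 ))
}

# 36 hours for 1500 GB members, scaled to 5 GB members.
estimate_resync_seconds $((36 * 3600)) 1500 5   # prints 432
```

Under that assumption a test array on 5 GB partitions would build in roughly seven minutes rather than a day and a half.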

Quote:

Originally Posted by zcruzm (Post 3829927)
Can you suggest any that you've had good results with?

My own distro, of course :) Debian.
But seriously, I have been thoroughly messing around with RAID arrays: bare-metal restores, totally zeroing a disk and trying to rebuild the array so I could put the data back, changing UUIDs, removing disks from the array and re-adding them, screwing up the formatting, changing partition tables, trying to break it, making disks defective. Anything except physical violence, and everything you never want to do on your live server.
Not once did mdadm respond in an undefined way. Where it got messed up, it was entirely my own misunderstanding, and wherever the mdadm specification allowed it I could restore or recreate the array. Of course there were some cases where I destroyed the array beyond repair.

Three more points: you cannot boot from RAID5, but you were not doing that, were you? You can boot from RAID1 but don't forget to install GRUB on both physical disks.

If you want to install Debian use the Stable version.

You don't have to fully resync the array before testing. Even if you reboot after 2% of the resync, you should see no errors: the array should re-assemble and the resync will continue from the point it had reached when you rebooted.
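One way to watch this after a reboot, besides reading /proc/mdstat, is a small sketch like the following. It assumes the kernel exposes the md sysfs attributes `sync_action` and `sync_completed` for the array, which may not hold on every kernel; the function name and its optional second argument are made up for illustration.

```shell
# md_sync_state MD_NAME [SYSFS_ROOT]
# Hypothetical helper: prints the array's current sync action (for example
# idle or resync) and, when available, its progress as reported by the kernel.
# Returns 1 if the sync_action attribute cannot be read.
# SYSFS_ROOT defaults to /sys; it exists only so the function can be tried
# against a test directory.
md_sync_state() {
    base="${2:-/sys}/block/$1/md"
    [ -r "$base/sync_action" ] || return 1
    printf 'action: %s\n' "$(cat "$base/sync_action")"
    if [ -r "$base/sync_completed" ]; then
        printf 'completed: %s\n' "$(cat "$base/sync_completed")"
    fi
}
```

Running `md_sync_state md0` before and after a reboot should show the resync picking up again rather than starting from zero, if the behaviour described above holds.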

jlinkels


All times are GMT -5.