Hi all,
I am trying to build a distributed RAID (5 or 6) array across several Linux servers in a LAN environment.
The goal is to make use of all the hard disks of all the servers, for safe I/O and data storage.
We do not care much about I/O speed, since there are not many users (no frequent reads/writes) and only 3-5 machines.
In this case, I think RAID 5 or 6 is good enough, while RAID 1/10 wastes too much storage.
On the other hand, to share the local hard disks (block devices) with remote machines and build the RAID over the network, I decided to use RAID over NBD, since it is simple and free.
I have done some tests so far using two Ubuntu 14.10 desktops (let us say host0 and host1). The software I used is:
Quote:
mdadm -- v3.3
nbd-server/client -- v3.8
host1 shares 4 block devices (/dev/sda{5,6,7,8}) over the LAN using nbd-server; they are connected as /dev/nbd{0,1,2,3} on host0.
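Roughly, the export/connect step looks like this (just a sketch using named exports; the export names and the nbd numbering are my own convention):
Code:
# /etc/nbd-server/config on host1
[generic]
[sda5]
    exportname = /dev/sda5
[sda6]
    exportname = /dev/sda6
[sda7]
    exportname = /dev/sda7
[sda8]
    exportname = /dev/sda8

# on host0
# nbd-client host1 /dev/nbd0 -N sda5
# nbd-client host1 /dev/nbd1 -N sda6
# nbd-client host1 /dev/nbd2 -N sda7
# nbd-client host1 /dev/nbd3 -N sda8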
The RAID 5 array is then created on host0 from these together with the local /dev/sda{5,6,7}:
Code:
# mdadm --create --auto=yes /dev/md0 --level=5 --raid-devices=5 --spare-devices=2 /dev/sda{5,6,7} /dev/nbd{0,1,2,3}
# mkfs.ext4 /dev/md0
# mount /dev/md0 /mnt/md
Up to this point everything goes smoothly and works very well.
However, since the electric power here is not very reliable, I tried rebooting host1 to see whether the RAID could be recovered correctly.
After host1 came back up, the file system mounted at host0:/mnt/md had, of course, gone read-only. I then unmounted the RAID and checked the details:
Code:
# umount /mnt/md
# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Wed Jan 28 09:48:10 2015
Raid Level : raid5
Array Size : 195172352 (186.13 GiB 199.86 GB)
Used Dev Size : 48793088 (46.53 GiB 49.96 GB)
Raid Devices : 5
Total Devices : 7
Persistence : Superblock is persistent
Update Time : Wed Jan 28 11:11:53 2015
State : clean, FAILED
Active Devices : 3
Working Devices : 3
Failed Devices : 4
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : host0:0 (local to host host0)
UUID : fa43d095:47309bb4:4beaccca:fde903a4
Events : 23
Number Major Minor RaidDevice State
0 8 5 0 active sync /dev/sda5
1 8 6 1 active sync /dev/sda6
2 8 7 2 active sync /dev/sda7
6 0 0 6 removed
8 0 0 8 removed
3 43 0 - faulty /dev/nbd0
5 43 32 - faulty /dev/nbd2
6 43 48 - faulty /dev/nbd3
7 43 16 - faulty /dev/nbd1
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 nbd1[7](F) nbd3[6](F) nbd2[5](F) nbd0[3](F) sda7[2] sda6[1] sda5[0]
195172352 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/3] [UUU__]
unused devices: <none>
This is more or less what we would expect. Note that after host1 comes back, its NBD devices have to be re-connected on host0 before mdadm can read them again.
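Roughly (a sketch, using the same export names as above):
Code:
# nbd-client -d /dev/nbd0
# nbd-client host1 /dev/nbd0 -N sda5
# (repeat for /dev/nbd1../dev/nbd3 with the corresponding exports)
With the NBD devices readable again, I tried to assemble the RAID: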
Code:
# mdadm --stop /dev/md0
# mdadm --assemble --force /dev/md0 /dev/sda{5,6,7} /dev/nbd{0,1,2,3}
mdadm: clearing FAULTY flag for device 5 in /dev/md0 for /dev/nbd2
mdadm: clearing FAULTY flag for device 6 in /dev/md0 for /dev/nbd3
mdadm: Marking array /dev/md0 as 'clean'
mdadm: /dev/md0 assembled from 3 drives and 2 spares - not enough to start the array.
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sda5[0](S) nbd3[6](S) nbd2[5](S) nbd1[7](S) nbd0[3](S) sda7[2](S) sda6[1](S)
341563993 blocks super 1.2
unused devices: <none>
# mdadm --examine /dev/sda{5,6,7} /dev/nbd{0,1,2,3}
*** All the states are 'clean' ***
Array State : AAA..
Array State : AAA..
Array State : AAA..
Array State : AAAAA
Array State : AAAAA
Array State : AAAAA
Array State : AAAAA
So I completely failed to re-assemble the RAID, and mdadm --manage does not help either.
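For what it is worth, my --manage attempts were along these lines (from memory, so the exact commands are approximate):
Code:
# mdadm --manage /dev/md0 --re-add /dev/nbd0
# mdadm --manage /dev/md0 --add /dev/nbd0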
However, I found that re-creating the RAID works:
Code:
# mdadm --create --auto=yes /dev/md0 --level=5 --raid-devices=5 --spare-devices=2 /dev/sda{5,6,7} /dev/nbd{0,1,2,3}
I have checked, and all the data is there.
Sorry for the boring details. Here are my questions:
1. Is re-creating the RAID always safe?
2. How can I re-assemble (or recover) the RAID after one node reboots? I did not write anything to the devices, so in principle all the data should still be there, right?
BTW, RAID 6, which can rebuild after 2 failed disks, would also be an option here. However, it does not solve my problem, since all the nodes may shut down at the same time if the electric power trips.
Thank you very much for any comments and suggestions!